  1. Jul 10, 2023
    • Merge tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 76487845
      Linus Torvalds authored
      Pull xfs fix from Darrick Wong:
       "Nothing exciting here, just getting rid of a gcc warning that I got
        tired of seeing when I turn on gcov"
      
      * tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: fix uninit warning in xfs_growfs_data
    • Merge tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · 4770353b
      Linus Torvalds authored
      Pull more smb client updates from Steve French:
      
       - fix potential use after free in unmount
      
       - minor cleanup
      
       - add worker to cleanup stale directory leases
      
      * tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Add a laundromat thread for cached directories
        smb: client: remove redundant pointer 'server'
        cifs: fix session state transition to avoid use-after-free issue
    • Merge tag 'ntb-6.5' of https://github.com/jonmason/ntb · cff06873
      Linus Torvalds authored
      Pull NTB updates from Jon Mason:
       "Fixes for pci_clear_master, error handling in driver inits, and
        various other issues/bugs"
      
      * tag 'ntb-6.5' of https://github.com/jonmason/ntb:
        ntb: hw: amd: Fix debugfs_create_dir error checking
        ntb.rst: Fix copy and paste error
        ntb_netdev: Fix module_init problem
        ntb: intel: Remove redundant pci_clear_master
        ntb: epf: Remove redundant pci_clear_master
        ntb_hw_amd: Remove redundant pci_clear_master
        ntb: idt: drop redundant pci_enable_pcie_error_reporting()
        MAINTAINERS: git://github -> https://github.com for jonmason
        NTB: EPF: fix possible memory leak in pci_vntb_probe()
        NTB: ntb_tool: Add check for devm_kcalloc
        NTB: ntb_transport: fix possible memory leak while device_register() fails
        ntb: intel: Fix error handling in intel_ntb_pci_driver_init()
        NTB: amd: Fix error handling in amd_ntb_pci_driver_init()
        ntb: idt: Fix error handling in idt_pci_driver_init()
  2. Jul 09, 2023
    • mm: lock newly mapped VMA with corrected ordering · 1c7873e3
      Hugh Dickins authored
      Lockdep is certainly right to complain about
      
        (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f
                       but task is already holding lock:
        (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db
      
      Invert those to the usual ordering.
      
      Fixes: 33313a74 ("mm: lock newly mapped VMA which can be modified after it becomes visible")
      Cc: stable@vger.kernel.org
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Tested-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 946c6b59
      Linus Torvalds authored
      
      Pull hotfixes from Andrew Morton:
       "16 hotfixes. Six are cc:stable and the remainder address post-6.4
        issues"
      
      The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since
      it was all hopefully fixed in mainline.
      
      * tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        lib: dhry: fix sleeping allocations inside non-preemptable section
        kasan, slub: fix HW_TAGS zeroing with slub_debug
        kasan: fix type cast in memory_is_poisoned_n
        mailmap: add entries for Heiko Stuebner
        mailmap: update manpage link
        bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
        MAINTAINERS: add linux-next info
        mailmap: add Markus Schneider-Pargmann
        writeback: account the number of pages written back
        mm: call arch_swap_restore() from do_swap_page()
        squashfs: fix cache race with migration
        mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison
        docs: update ocfs2-devel mailing list address
        MAINTAINERS: update ocfs2-devel mailing list address
        mm: disable CONFIG_PER_VMA_LOCK until its fixed
        fork: lock VMAs of the parent process when forking
    • fork: lock VMAs of the parent process when forking · fb49c455
      Suren Baghdasaryan authored
      
      
      When forking a child process, the parent write-protects anonymous pages
      and COW-shares them with the child being forked using copy_present_pte().
      
      We must not take any concurrent page faults on the source vma's as they
      are being processed, as we expect both the vma and the pte's behind it
      to be stable.  For example, the anon_vma_fork() expects the parent's
      vma->anon_vma to not change during the vma copy.
      
      A concurrent page fault on a page newly marked read-only by the page
      copy might trigger wp_page_copy() and an anon_vma_prepare(vma) on the
      source vma, defeating the anon_vma_clone() that wasn't done because the
      parent vma originally didn't have an anon_vma, but we now might end up
      copying a pte entry for a page that has one.
      
      Before the per-vma lock based changes, the mmap_lock guaranteed
      exclusion with concurrent page faults.  But now we need to do a
      vma_start_write() to make sure no concurrent faults happen on this vma
      while it is being processed.
      
      This fix can potentially regress some fork-heavy workloads.  Kernel
      build time did not show noticeable regression on a 56-core machine while
      a stress test mapping 10000 VMAs and forking 5000 times in a tight loop
      shows ~5% regression.  If such fork time regression is unacceptable,
      disabling CONFIG_PER_VMA_LOCK should restore its performance.  Further
      optimizations are possible if this regression proves to be problematic.
      
      Suggested-by: David Hildenbrand <david@redhat.com>
      Reported-by: Jiri Slaby <jirislaby@kernel.org>
      Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
      Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
      Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
      Reported-by: Jacob Young <jacobly.alt@gmail.com>
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
      Fixes: 0bff0aae ("x86/mm: try VMA lock-based page fault handling first")
      Cc: stable@vger.kernel.org
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: lock newly mapped VMA which can be modified after it becomes visible · 33313a74
      Suren Baghdasaryan authored
      
      
      mmap_region adds a newly created VMA into the VMA tree and might modify it
      afterwards before dropping the mmap_lock.  This poses a problem for page
      faults handled under per-VMA locks because they don't take the mmap_lock
      and can stumble on this VMA while it's still being modified.  Currently
      this does not pose a problem since post-addition modifications are done
      only for file-backed VMAs, which are not handled under per-VMA lock.
      However, once support for handling file-backed page faults with per-VMA
      locks is added, this will become a race.
      
      Fix this by write-locking the VMA before inserting it into the VMA tree.
      Other places where a new VMA is added into the VMA tree do not modify it
      after the insertion, so do not need the same locking.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: lock a vma before stack expansion · c137381f
      Suren Baghdasaryan authored
      
      
      With recent changes necessitating mmap_lock to be held for write while
      expanding a stack, per-VMA locks should follow the same rules and be
      write-locked to prevent page faults into the VMA being expanded. Add
      the necessary locking.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 7fcd473a
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "A few late arriving patches that missed the initial pull request. It's
        mostly bug fixes (the dt-bindings is a fix for the initial pull)"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Remove unused function declaration
        scsi: target: docs: Remove tcm_mod_builder.py
        scsi: target: iblock: Quiet bool conversion warning with pr_preempt use
        scsi: dt-bindings: ufs: qcom: Fix ICE phandle
        scsi: core: Simplify scsi_cdl_check_cmd()
        scsi: isci: Fix comment typo
        scsi: smartpqi: Replace one-element arrays with flexible-array members
        scsi: target: tcmu: Replace strlcpy() with strscpy()
        scsi: ncr53c8xx: Replace strlcpy() with strscpy()
        scsi: lpfc: Fix lpfc_name struct packing
    • Merge tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 84dc5aa3
      Linus Torvalds authored
      Pull more i2c updates from Wolfram Sang:
      
       - xiic patch should have been in the original pull but slipped through
      
       - mpc patch fixes a build regression
      
       - nomadik cleanup
      
      * tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: mpc: Drop unused variable
        i2c: nomadik: Remove a useless call in the remove function
        i2c: xiic: Don't try to handle more interrupt events after error
    • Merge tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 8fc3b8f0
      Linus Torvalds authored
      Pull hardening fixes from Kees Cook:
      
       - Check for NULL bdev in LoadPin (Matthias Kaehlcke)
      
       - Revert unwanted KUnit FORTIFY build default
      
       - Fix 1-element array causing boot warnings with xhci-hub
      
      * tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        usb: ch9: Replace bmSublinkSpeedAttr 1-element array with flexible array
        Revert "fortify: Allow KUnit test to build without FORTIFY"
        dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter
    • ntb: hw: amd: Fix debugfs_create_dir error checking · bff6efc5
      Anup Sharma authored
      
      
      The debugfs_create_dir() function returns an ERR_PTR on error, and the
      only correct way to check for an error is the IS_ERR() inline function.
      Replace the NULL comparison with IS_ERR().
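
      The difference between the two checks can be sketched in userspace C.
      This is an illustration under stated assumptions, not the driver code:
      ERR_PTR(), PTR_ERR() and IS_ERR() below are simplified stand-ins for the
      kernel helpers of the same names, and create_dir() is a hypothetical
      function that, like debugfs_create_dir(), encodes failure as an error
      pointer rather than NULL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical creator: on failure it returns an error pointer, never NULL. */
static int dummy_dir;

static void *create_dir(bool fail)
{
	return fail ? ERR_PTR(-ENOMEM) : (void *)&dummy_dir;
}

/* Buggy pattern: an error pointer is not NULL, so the failure is missed. */
static bool null_check_sees_error(const void *dir)
{
	return dir == NULL;
}

/* Correct pattern: IS_ERR() recognizes the encoded errno value. */
static bool is_err_check_sees_error(const void *dir)
{
	return IS_ERR(dir);
}
```

      With a failing create_dir(true), the NULL comparison reports success
      while the IS_ERR() check reports the error, and PTR_ERR() recovers the
      -ENOMEM value.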
      
      Signed-off-by: Anup Sharma <anupnewsmail@gmail.com>
      Suggested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
      Signed-off-by: Jon Mason <jdmason@kudzu.us>
    • Merge tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next · c206353d
      Linus Torvalds authored
      
      Pull more perf tools updates from Namhyung Kim:
       "These are remaining changes and fixes for this cycle.
      
        Build:
      
         - Allow generating vmlinux.h from BTF using `make GEN_VMLINUX_H=1`
           and skip if the vmlinux has no BTF.
      
         - Replace deprecated clang -target xxx option by --target=xxx.
      
        perf record:
      
         - Print event attributes with well known type and config symbols in
           the debug output like below:
      
             # perf record -e cycles,cpu-clock -C0 -vv true
             <SNIP>
             ------------------------------------------------------------
             perf_event_attr:
               type                             0 (PERF_TYPE_HARDWARE)
               size                             136
               config                           0 (PERF_COUNT_HW_CPU_CYCLES)
               { sample_period, sample_freq }   4000
               sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
               read_format                      ID
               disabled                         1
               inherit                          1
               freq                             1
               sample_id_all                    1
               exclude_guest                    1
             ------------------------------------------------------------
             sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
             ------------------------------------------------------------
             perf_event_attr:
               type                             1 (PERF_TYPE_SOFTWARE)
               size                             136
               config                           0 (PERF_COUNT_SW_CPU_CLOCK)
               { sample_period, sample_freq }   4000
               sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
               read_format                      ID
               disabled                         1
               inherit                          1
               freq                             1
               sample_id_all                    1
               exclude_guest                    1
      
         - Update AMD IBS event error message since it now supports per-process
           profiling but not privilege filters.
      
             $ sudo perf record -e ibs_op//k -C 0
             Error:
             AMD IBS doesn't support privilege filtering. Try again without
             the privilege modifiers (like 'k') at the end.
      
        perf lock contention:
      
         - Support CSV style output using -x option
      
             $ sudo perf lock con -ab -x, sleep 1
             # output: contended, total wait, max wait, avg wait, type, caller
             19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0
             15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e
             4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d
             1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135
             8, 67608, 27404, 8451, spinlock, __queue_work+0x174
             3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff
             3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248
             2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad
             1, 24619, 24619, 24619, spinlock, rcu_core+0xd4
      
         - Add --output option to save the data to a file so that it is not
           mixed with other debug messages.
      
        Test:
      
         - Fix event parsing test on ARM where there is no raw PMU and no
           support for PERF_PMU_CAP_EXTENDED_HW_TYPE.
      
         - Update the lock contention test case for CSV output.
      
         - Fix a segfault in the daemon command test.
      
        Vendor events (JSON):
      
         - Add has_event() to check if the given event is available on system
           at runtime. On Intel machines, some transaction events may not be
           present when TSX is disabled.
      
         - Update Intel event metrics.
      
        Misc:
      
         - Sort symbols by name using an external array of pointers instead of
           an rbtree node in the symbol. This saves 16 or 24 bytes per
           symbol, whether or not the sorting is actually requested.
      
         - Fix unwinding DWARF callstacks using libdw when --symfs option is
           used"
      
      * tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next: (38 commits)
        perf test: Fix event parsing test when PERF_PMU_CAP_EXTENDED_HW_TYPE isn't supported.
        perf test: Fix event parsing test on Arm
        perf evsel amd: Fix IBS error message
        perf: unwind: Fix symfs with libdw
        perf symbol: Fix uninitialized return value in symbols__find_by_name()
        perf test: Test perf lock contention CSV output
        perf lock contention: Add --output option
        perf lock contention: Add -x option for CSV style output
        perf lock: Remove stale comments
        perf vendor events intel: Update tigerlake to 1.13
        perf vendor events intel: Update skylakex to 1.31
        perf vendor events intel: Update skylake to 57
        perf vendor events intel: Update sapphirerapids to 1.14
        perf vendor events intel: Update icelakex to 1.21
        perf vendor events intel: Update icelake to 1.19
        perf vendor events intel: Update cascadelakex to 1.19
        perf vendor events intel: Update meteorlake to 1.03
        perf vendor events intel: Add rocketlake events/metrics
        perf vendor metrics intel: Make transaction metrics conditional
        perf jevents: Support for has_event function
        ...
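
      The symbol-sorting item in the pull message above can be sketched in
      plain C. This is only an illustration, not perf's code: struct sym,
      build_name_index() and find_by_name() are hypothetical names. The point
      is that the name-sorted view lives in a separately allocated array of
      pointers, so a symbol carries no per-symbol tree node and the array is
      only built when a lookup by name is wanted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol record: note there is no embedded tree node. */
struct sym {
	const char *name;
	unsigned long addr;
};

/* qsort() comparator over an array of 'struct sym *'. */
static int cmp_sym_ptr(const void *a, const void *b)
{
	const struct sym *sa = *(const struct sym *const *)a;
	const struct sym *sb = *(const struct sym *const *)b;

	return strcmp(sa->name, sb->name);
}

/*
 * Build a name-sorted index of pointers into syms[].
 * The caller frees the result; returns NULL if allocation fails.
 */
static struct sym **build_name_index(struct sym *syms, size_t n)
{
	struct sym **idx = malloc(n * sizeof(*idx));

	if (!idx)
		return NULL;
	for (size_t i = 0; i < n; i++)
		idx[i] = &syms[i];
	qsort(idx, n, sizeof(*idx), cmp_sym_ptr);
	return idx;
}

/* bsearch() comparator: key is a name, element is a 'struct sym *'. */
static int cmp_key(const void *key, const void *elem)
{
	const struct sym *s = *(const struct sym *const *)elem;

	return strcmp((const char *)key, s->name);
}

/* Binary-search the index; returns NULL when the name is absent. */
static struct sym *find_by_name(struct sym **idx, size_t n, const char *name)
{
	struct sym **hit = bsearch(name, idx, n, sizeof(*idx), cmp_key);

	return hit ? *hit : NULL;
}
```

      The index costs one pointer per symbol, and only for tables that are
      actually searched by name.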
    • Merge tag 'bitmap-6.5-rc1' of https://github.com/norov/linux · ad8258e8
      Linus Torvalds authored
      Pull bitmap updates from Yury Norov:
       "Fixes for different bitmap pieces:
      
         - lib/test_bitmap: increment failure counter properly
      
           The tests that don't use the expect_eq() macro to determine whether
           a test has failed must increment failed_tests explicitly.
      
         - lib/bitmap: drop optimization of bitmap_{from,to}_arr64
      
           bitmap_{from,to}_arr64() optimization is overly optimistic
           on 32-bit LE architectures when it's wired to
           bitmap_copy_clear_tail().
      
         - nodemask: Drop duplicate check in for_each_node_mask()
      
           As the return value type of first_node() became unsigned, the
           node >= 0 check became unnecessary.
      
         - cpumask: fix function description kernel-doc notation
      
         - MAINTAINERS: Add bits.h and bitfield.h to the BITMAP API record
      
           Add linux/bits.h and linux/bitfield.h for visibility"
      
      * tag 'bitmap-6.5-rc1' of https://github.com/norov/linux:
        MAINTAINERS: Add bitfield.h to the BITMAP API record
        MAINTAINERS: Add bits.h to the BITMAP API record
        cpumask: fix function description kernel-doc notation
        nodemask: Drop duplicate check in for_each_node_mask()
        lib/bitmap: drop optimization of bitmap_{from,to}_arr64
        lib/test_bitmap: increment failure counter properly
    • lib: dhry: fix sleeping allocations inside non-preemptable section · 8ba388c0
      Geert Uytterhoeven authored
      The Smatch static checker reports the following warnings:
      
          lib/dhry_run.c:38 dhry_benchmark() warn: sleeping in atomic context
          lib/dhry_run.c:43 dhry_benchmark() warn: sleeping in atomic context
      
      Indeed, dhry() does sleeping allocations inside the non-preemptable
      section delimited by get_cpu()/put_cpu().
      
      Fix this by using atomic allocations instead.
      Add error handling, as these atomic allocations may fail.
      
      Link: https://lkml.kernel.org/r/bac6d517818a7cd8efe217c1ad649fffab9cc371.1688568764.git.geert+renesas@glider.be
      Fixes: 13684e96 ("lib: dhry: fix unstable smp_processor_id(_) usage")
      Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
      Closes: https://lore.kernel.org/r/0469eb3a-02eb-4b41-b189-de20b931fa56@moroto.mountain
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • kasan, slub: fix HW_TAGS zeroing with slub_debug · fdb54d96
      Andrey Konovalov authored
      Commit 946fa0db ("mm/slub: extend redzone check to extra allocated
      kmalloc space than requested") added precise kmalloc redzone poisoning to
      the slub_debug functionality.
      
      However, this commit didn't account for HW_TAGS KASAN fully initializing
      the object via its built-in memory initialization feature.  Even though
      HW_TAGS KASAN memory initialization contains special memory initialization
      handling for when slub_debug is enabled, it does not account for in-object
      slub_debug redzones.  As a result, HW_TAGS KASAN can overwrite these
      redzones and cause false-positive slub_debug reports.
      
      To fix the issue, avoid HW_TAGS KASAN memory initialization when
      slub_debug is enabled altogether.  Implement this by moving the
      __slub_debug_enabled check to slab_post_alloc_hook.  Common slab code
      seems like a more appropriate place for a slub_debug check anyway.
      
      Link: https://lkml.kernel.org/r/678ac92ab790dba9198f9ca14f405651b97c8502.1688561016.git.andreyknvl@google.com
      Fixes: 946fa0db ("mm/slub: extend redzone check to extra allocated kmalloc space than requested")
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reported-by: Will Deacon <will@kernel.org>
      Acked-by: Marco Elver <elver@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: kasan-dev@googlegroups.com
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • kasan: fix type cast in memory_is_poisoned_n · 05c56e7b
      Andrey Konovalov authored
      Commit bb6e04a1 ("kasan: use internal prototypes matching gcc-13
      builtins") introduced a bug into the memory_is_poisoned_n implementation:
      it effectively removed the cast to a signed integer type after applying
      KASAN_GRANULE_MASK.
      
      As a result, KASAN started failing to properly check memset, memcpy, and
      other similar functions.
      
      Fix the bug by adding the cast back (through an additional signed integer
      variable to make the code more readable).
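
      Why the missing signed cast matters can be shown with a small userspace
      sketch. It is not the KASAN implementation: GRANULE_MASK, the function
      names and the 8-byte granule are assumptions made for illustration. It
      assumes the generic KASAN shadow encoding, where a negative shadow byte
      marks a fully inaccessible granule and a value of 1 to 7 gives the
      number of accessible leading bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define GRANULE_MASK 7UL /* assumed 8-byte granule, for illustration only */

/*
 * Buggy form: the masked offset stays unsigned long. The explicit cast
 * below spells out what the usual arithmetic conversions do implicitly:
 * a negative shadow value becomes a huge unsigned number, so the
 * comparison is false for poisoned memory.
 */
static bool poisoned_unsigned(unsigned long last_byte, int8_t shadow)
{
	return (last_byte & GRANULE_MASK) >= (unsigned long)shadow;
}

/*
 * Fixed form: store the masked offset in a signed 8-bit variable first,
 * so both operands are promoted to int and negative shadow values
 * compare as smaller than any offset.
 */
static bool poisoned_signed(unsigned long last_byte, int8_t shadow)
{
	int8_t last_accessible_byte = (int8_t)(last_byte & GRANULE_MASK);

	return last_accessible_byte >= shadow;
}
```

      For a shadow value of -1 and an offset of 3, the unsigned variant
      reports the access as valid while the signed variant flags it. For
      partially accessible granules (shadow 1 to 7) the two agree.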
      
      Link: https://lkml.kernel.org/r/8c9e0251c2b8b81016255709d4ec42942dcaf018.1688431866.git.andreyknvl@google.com
      Fixes: bb6e04a1 ("kasan: use internal prototypes matching gcc-13 builtins")
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mailmap: add entries for Heiko Stuebner · d3a808ec
      Heiko Stuebner authored
      
      
      I am going to lose my vrull.eu address at the end of July, and while
      adding it to mailmap I also realised that there are more old addresses
      from me dangling, so update .mailmap for all of them.
      
      Link: https://lkml.kernel.org/r/20230704163919.1136784-3-heiko@sntech.de
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mailmap: update manpage link · ddcd91f4
      Heiko Stuebner authored
      
      
      Patch series "Update .mailmap for my work address and fix manpage".
      
      While updating mailmap for the going-away address, I also found that on
      current systems the manpage linked from the header comment changed.
      
      And in fact it looks like the git mailmap feature got its own manpage.
      
      
      This patch (of 2):
      
      On recent systems the git-shortlog manpage only tells people to
          See gitmailmap(5)
      
      So instead of sending people on a scavenger hunt, put that info into the
      header directly.  Though keep the old reference around for older systems.
      
      Link: https://lkml.kernel.org/r/20230704163919.1136784-1-heiko@sntech.de
      Link: https://lkml.kernel.org/r/20230704163919.1136784-2-heiko@sntech.de
      Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page · 028725e7
      Liu Shixin authored
      Commit dd0ff4d1 ("bootmem: remove the vmemmap pages from kmemleak in
      put_page_bootmem") fixed an "overlaps existing" problem in kmemleak.  But
      the problem still exists when HAVE_BOOTMEM_INFO_NODE is disabled, because in
      this case, free_bootmem_page() will call free_reserved_page() directly.
      
      Fix the problem by adding kmemleak_free_part() in free_bootmem_page() when
      HAVE_BOOTMEM_INFO_NODE is disabled.
      
      Link: https://lkml.kernel.org/r/20230704101942.2819426-1-liushixin2@huawei.com
      Fixes: f41f2ed4 ("mm: hugetlb: free the vmemmap pages associated with each HugeTLB page")
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Acked-by: Muchun Song <songmuchun@bytedance.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • MAINTAINERS: add linux-next info · 0d707cde
      Randy Dunlap authored
      
      
      Add linux-next info to MAINTAINERS for ease of finding this data.
      
      Link: https://lkml.kernel.org/r/20230704054410.12527-1-rdunlap@infradead.org
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mailmap: add Markus Schneider-Pargmann · 6dedd768
      Markus Schneider-Pargmann authored
      
      
      Add my old mail address and update my name.
      
      Link: https://lkml.kernel.org/r/20230628081341.3470229-1-msp@baylibre.com
      Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • writeback: account the number of pages written back · 8344a3d4
      Matthew Wilcox (Oracle) authored
      nr_to_write is a count of pages, so we need to decrease it by the number
      of pages in the folio we just wrote, not by 1.  Most callers specify
      either LONG_MAX or 1, so are unaffected, but writeback_sb_inodes() might
      end up writing 512x as many pages as it asked for.
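
      The accounting difference can be illustrated with a toy model. This is
      a sketch, not the writeback code: struct fake_folio and writeback() are
      hypothetical, and the loop only models a page budget being consumed
      while folios are written.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: only its size in pages matters here. */
struct fake_folio {
	long nr_pages;
};

/*
 * Write folios until the page budget nr_to_write is used up and return
 * the number of pages actually written.
 *
 * account_pages == false models the old behaviour (budget reduced by 1
 * per folio); account_pages == true models the fix (budget reduced by
 * the number of pages in the folio just written).
 */
static long writeback(const struct fake_folio *folios, size_t n,
		      long nr_to_write, bool account_pages)
{
	long written = 0;

	for (size_t i = 0; i < n && nr_to_write > 0; i++) {
		written += folios[i].nr_pages;
		nr_to_write -= account_pages ? folios[i].nr_pages : 1;
	}
	return written;
}
```

      With four 512-page folios and a budget of 4 pages, the old accounting
      writes all four folios (2048 pages, 512 times the request), while the
      fixed accounting stops after the first folio.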
      
      Dave added:
      
      : XFS is the only filesystem this would affect, right?  AFAIA, nothing
      : else enables large folios and uses writeback through
      : write_cache_pages() at this point...
      : 
      : In which case, I'd be surprised if much difference, if any, gets
      : noticed by anyone.
      
      Link: https://lkml.kernel.org/r/20230628185548.981888-1-willy@infradead.org
      Fixes: 793917d9 ("mm/readahead: Add large folio readahead")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: call arch_swap_restore() from do_swap_page() · 6dca4ac6
      Peter Collingbourne authored
      Commit c145e0b4 ("mm: streamline COW logic in do_swap_page()") moved
      the call to swap_free() before the call to set_pte_at(), which meant that
      the MTE tags could end up being freed before set_pte_at() had a chance to
      restore them.  Fix it by adding a call to the arch_swap_restore() hook
      before the call to swap_free().
      
      Link: https://lkml.kernel.org/r/20230523004312.1807357-2-pcc@google.com
      Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
      Fixes: c145e0b4 ("mm: streamline COW logic in do_swap_page()")
      Signed-off-by: Peter Collingbourne <pcc@google.com>
      Reported-by: Qun-wei Lin <Qun-wei.Lin@mediatek.com>
      Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@mediatek.com/
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: "Huang, Ying" <ying.huang@intel.com>
      Reviewed-by: Steven Price <steven.price@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: <stable@vger.kernel.org>	[6.1+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • squashfs: fix cache race with migration · 08bab74a
      Vincent Whitchurch authored
      Migration replaces the page in the mapping before copying the contents and
      the flags over from the old page, so check that the page in the page cache
      is really up to date before using it.  Without this, stressing squashfs
      reads with parallel compaction sometimes results in squashfs reporting
      data corruption.
      
      Link: https://lkml.kernel.org/r/20230629-squashfs-cache-migration-v1-1-d50ebe55099d@axis.com
      Fixes: e994f5b6 ("squashfs: cache partial compressed blocks")
      Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Phillip Lougher <phillip@squashfs.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      08bab74a
    • John Hubbard's avatar
      mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison · 191fcdb6
      John Hubbard authored
      The following crash happens for me when running the -mm selftests (below).
      Specifically, it happens while running the uffd-stress subtests:
      
      kernel BUG at mm/hugetlb.c:7249!
      invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 0 PID: 3238 Comm: uffd-stress Not tainted 6.4.0-hubbard-github+ #109
      Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1503 08/03/2018
      RIP: 0010:huge_pte_alloc+0x12c/0x1a0
      ...
      Call Trace:
       <TASK>
       ? __die_body+0x63/0xb0
       ? die+0x9f/0xc0
       ? do_trap+0xab/0x180
       ? huge_pte_alloc+0x12c/0x1a0
       ? do_error_trap+0xc6/0x110
       ? huge_pte_alloc+0x12c/0x1a0
       ? handle_invalid_op+0x2c/0x40
       ? huge_pte_alloc+0x12c/0x1a0
       ? exc_invalid_op+0x33/0x50
       ? asm_exc_invalid_op+0x16/0x20
       ? __pfx_put_prev_task_idle+0x10/0x10
       ? huge_pte_alloc+0x12c/0x1a0
       hugetlb_fault+0x1a3/0x1120
       ? finish_task_switch+0xb3/0x2a0
       ? lock_is_held_type+0xdb/0x150
       handle_mm_fault+0xb8a/0xd40
       ? find_vma+0x5d/0xa0
       do_user_addr_fault+0x257/0x5d0
       exc_page_fault+0x7b/0x1f0
       asm_exc_page_fault+0x22/0x30
      
      That happens because a BUG() statement in huge_pte_alloc() attempts to
      check that a pte, if present, is a hugetlb pte, but it does so in a
      non-lockless-safe manner that leads to a false BUG() report.
      
      We got here due to a couple of bugs, each of which by itself was not quite
      enough to cause a problem:
      
      First of all, before commit c33c7948 ("mm: ptep_get() conversion"), the
      BUG() statement in huge_pte_alloc() was itself fragile: it relied upon
      compiler behavior to only read the pte once, despite using it twice in the
      same conditional.
      
      Next, commit c33c7948 ("mm: ptep_get() conversion") broke that
      delicate situation, by causing all direct pte reads to be done via
      READ_ONCE().  And so READ_ONCE() got called twice within the same BUG()
      conditional, leading to comparing (potentially, occasionally) different
      versions of the pte, and thus to false BUG() reports.
      
      Fix this by taking a single snapshot of the pte before using it in the
      BUG conditional.
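
The difference between reading the pte twice and taking a single snapshot can be illustrated in userspace. This is a minimal sketch, not the kernel patch itself: `pte_model_t`, the `MODEL_*` bits, `read_once()` and both check functions are hypothetical stand-ins invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a pte value and its flag bits. */
typedef uint64_t pte_model_t;
#define MODEL_PRESENT (1ULL << 0)
#define MODEL_HUGE    (1ULL << 7)

/* Userspace approximation of READ_ONCE(): one real load per call. */
static inline pte_model_t read_once(const volatile pte_model_t *p)
{
	return *p;
}

static inline bool model_present(pte_model_t v) { return (v & MODEL_PRESENT) != 0; }
static inline bool model_huge(pte_model_t v)    { return (v & MODEL_HUGE) != 0; }

/*
 * Fragile form: two separate loads. If another thread changes *ptep
 * between them, the two halves of the condition test different values
 * and the check can fire even though no single value was ever invalid.
 */
static bool check_fires_two_reads(const volatile pte_model_t *ptep)
{
	return ptep && model_present(read_once(ptep)) &&
	       !model_huge(read_once(ptep));
}

/* Robust form: load once, then test only the local snapshot. */
static bool check_fires_snapshot(const volatile pte_model_t *ptep)
{
	pte_model_t v;

	if (!ptep)
		return false;
	v = read_once(ptep);
	return model_present(v) && !model_huge(v);
}
```

Without a concurrent writer both forms agree; they only diverge when the value changes between the two loads, which is the window the snapshot closes.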
      
      Now, that commit is only partially to blame here, but people doing
      bisections will invariably land there, so this will help them find a fix
      for a real crash.  And also, the previous behavior was unlikely to ever
      expose this bug--it was fragile, yet not actually broken.
      
      So that's why I chose this commit for the Fixes tag, rather than the
      commit that created the original BUG() statement.
      
      Link: https://lkml.kernel.org/r/20230701010442.2041858-1-jhubbard@nvidia.com
      Fixes: c33c7948 ("mm: ptep_get() conversion")
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: James Houghton <jthoughton@google.com>
      Acked-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
      Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Dave Airlie <airlied@gmail.com>
      Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: SeongJae Park <sj@kernel.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      191fcdb6
    • Anthony Iliopoulos's avatar
      docs: update ocfs2-devel mailing list address · 5a569db6
      Anthony Iliopoulos authored
      
      
      The ocfs2-devel mailing list has been migrated to the kernel.org
      infrastructure, update all related documentation pointers to reflect the
      change.
      
      Link: https://lkml.kernel.org/r/20230628013437.47030-3-ailiop@suse.com
      Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
      Acked-by: Joseph Qi <jiangqi903@gmail.com>
      Acked-by: Joel Becker <jlbec@evilplan.org>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      5a569db6
    • Anthony Iliopoulos's avatar
      MAINTAINERS: update ocfs2-devel mailing list address · a57b4b7f
      Anthony Iliopoulos authored
      
      
      The ocfs2-devel mailing list has been migrated to the kernel.org
      infrastructure, update the related entry to reflect the change.
      
      Link: https://lkml.kernel.org/r/20230628013437.47030-2-ailiop@suse.com
      Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
      Acked-by: Joseph Qi <jiangqi903@gmail.com>
      Acked-by: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      a57b4b7f
    • Suren Baghdasaryan's avatar
      mm: disable CONFIG_PER_VMA_LOCK until its fixed · f96c4867
      Suren Baghdasaryan authored
      
      
      A memory corruption was reported in [1] with bisection pointing to the
      patch [2] enabling per-VMA locks for x86.  Disable per-VMA locks config to
      prevent this issue until the fix is confirmed.  This is expected to be a
      temporary measure.
      
      [1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
      [2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
      
      Link: https://lkml.kernel.org/r/20230706011400.2949242-3-surenb@google.com
      Reported-by: Jiri Slaby <jirislaby@kernel.org>
      Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
      Reported-by: Jacob Young <jacobly.alt@gmail.com>
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
      Fixes: 0bff0aae ("x86/mm: try VMA lock-based page fault handling first")
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Holger Hoffstätte <holger@applied-asynchrony.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      f96c4867
    • Suren Baghdasaryan's avatar
      fork: lock VMAs of the parent process when forking · 2b4f3b49
      Suren Baghdasaryan authored
      Patch series "Avoid memory corruption caused by per-VMA locks", v4.
      
      A memory corruption was reported in [1] with bisection pointing to the
      patch [2] enabling per-VMA locks for x86.  Based on the reproducer
      provided in [1] we suspect this is caused by the lack of VMA locking while
      forking a child process.
      
      Patch 1/2 in the series implements proper VMA locking during fork.  I
      tested the fix locally using the reproducer and was unable to reproduce
      the memory corruption problem.
      
      This fix can potentially regress some fork-heavy workloads.  Kernel build
      time did not show noticeable regression on a 56-core machine while a
      stress test mapping 10000 VMAs and forking 5000 times in a tight loop
      shows ~7% regression.  If such fork time regression is unacceptable,
      disabling CONFIG_PER_VMA_LOCK should restore its performance.  Further
      optimizations are possible if this regression proves to be problematic.
      
      Patch 2/2 disables per-VMA locks until the fix is tested and verified.
      
      
      This patch (of 2):
      
      When forking a child process, parent write-protects an anonymous page and
      COW-shares it with the child being forked using copy_present_pte(). 
      Parent's TLB is flushed right before we drop the parent's mmap_lock in
      dup_mmap().  If we get a write-fault before that TLB flush in the parent,
      and we end up replacing that anonymous page in the parent process in
      do_wp_page() (because it is COW-shared with the child), this might lead to some
      stale writable TLB entries targeting the wrong (old) page.  Similar issue
      happened in the past with userfaultfd (see flush_tlb_page() call inside
      do_wp_page()).
      
      Lock VMAs of the parent process when forking a child, which prevents
      concurrent page faults during fork operation and avoids this issue.  This
      fix can potentially regress some fork-heavy workloads.  Kernel build time
      did not show noticeable regression on a 56-core machine while a stress
      test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~7%
      regression.  If such fork time regression is unacceptable, disabling
      CONFIG_PER_VMA_LOCK should restore its performance.  Further optimizations
      are possible if this regression proves to be problematic.
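
The idea of locking every parent VMA before copying can be sketched with a single-threaded toy model. Everything here is an assumption made for illustration: `struct mm_model`, `struct vma_model` and the four helpers are hypothetical names, and the sequence-number scheme is a simplification, not the kernel's implementation. A VMA counts as write-locked while its sequence number matches the mm's; the fault fast path must back off for such a VMA.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; not the kernel's mm_struct or vm_area_struct. */
struct mm_model {
	unsigned long lock_seq;      /* bumped when the mm write lock is dropped */
};

struct vma_model {
	unsigned long lock_seq;      /* equals mm->lock_seq while write-locked */
	struct vma_model *next;
};

/* Mark one VMA as write-locked for the current mm lock cycle. */
static void vma_model_write_lock(const struct mm_model *mm, struct vma_model *vma)
{
	vma->lock_seq = mm->lock_seq;
}

/* Fault fast path: usable only when the VMA is not write-locked. */
static bool fault_may_use_vma_lock(const struct mm_model *mm,
				   const struct vma_model *vma)
{
	return vma->lock_seq != mm->lock_seq;
}

/* Fork side: lock every parent VMA before any page table is copied. */
static void fork_lock_parent_vmas(const struct mm_model *mm, struct vma_model *head)
{
	struct vma_model *vma;

	for (vma = head; vma != NULL; vma = vma->next)
		vma_model_write_lock(mm, vma);
}

/* Dropping the mm write lock releases all VMA locks in one step. */
static void mm_model_write_unlock(struct mm_model *mm)
{
	mm->lock_seq++;
}
```

In this model a fault that arrives during fork sees the VMA as locked and has to take the slower, mmap_lock-protected path, so it cannot run concurrently with the copy. The extra loop over all VMAs is also where a fork-time cost like the one measured above would come from.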
      
      Link: https://lkml.kernel.org/r/20230706011400.2949242-1-surenb@google.com
      Link: https://lkml.kernel.org/r/20230706011400.2949242-2-surenb@google.com
      Fixes: 0bff0aae ("x86/mm: try VMA lock-based page fault handling first")
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Suggested-by: David Hildenbrand <david@redhat.com>
      Reported-by: Jiri Slaby <jirislaby@kernel.org>
      Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
      Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
      Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
      Reported-by: Jacob Young <jacobly.alt@gmail.com>
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
      Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      2b4f3b49
  3. Jul 08, 2023