  1. Jul 09, 2023
    • fork: lock VMAs of the parent process when forking · fb49c455
      Suren Baghdasaryan authored
      
      
      When forking a child process, the parent write-protects anonymous pages
      and COW-shares them with the child being forked using copy_present_pte().
      
      We must not take any concurrent page faults on the source vma's as they
      are being processed, as we expect both the vma and the pte's behind it
      to be stable.  For example, anon_vma_fork() expects the parent's
      vma->anon_vma to not change during the vma copy.
      
      A concurrent page fault on a page newly marked read-only by the page
      copy might trigger wp_page_copy() and an anon_vma_prepare(vma) on the
      source vma, defeating the anon_vma_clone() that wasn't done because the
      parent vma originally didn't have an anon_vma, but we now might end up
      copying a pte entry for a page that has one.
      
      Before the per-vma lock based changes, the mmap_lock guaranteed
      exclusion with concurrent page faults.  But now we need to do a
      vma_start_write() to make sure no concurrent faults happen on this vma
      while it is being processed.
      
      This fix can potentially regress some fork-heavy workloads.  Kernel
      build time did not show noticeable regression on a 56-core machine while
      a stress test mapping 10000 VMAs and forking 5000 times in a tight loop
      shows ~5% regression.  If such fork time regression is unacceptable,
      disabling CONFIG_PER_VMA_LOCK should restore its performance.  Further
      optimizations are possible if this regression proves to be problematic.
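      
      The stress test described above is not part of the commit. The
      following is only a minimal userspace sketch of such a test, assuming
      Linux and a hypothetical helper named fork_stress(): it builds
      nr_vmas single-page anonymous VMAs (alternating protections keep
      adjacent VMAs from merging), writes to each page so fork() has
      anonymous pages to COW-share, and then forks nr_forks times.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the commit): create nr_vmas one-page
 * anonymous VMAs, then fork and reap a child nr_forks times.
 * Returns the number of children that exited cleanly, or -1 on setup failure.
 */
int fork_stress(size_t nr_vmas, int nr_forks)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1;
    size_t page = (size_t)ps;

    assert(nr_vmas > 0 && nr_forks >= 0);

    size_t len = nr_vmas * page;
    char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    for (size_t i = 0; i < nr_vmas; i++) {
        char *p = base + i * page;

        p[0] = 1; /* fault in an anonymous page for fork() to copy */

        /* Make every odd page read-only so neighbouring VMAs differ
         * in protection and therefore cannot be merged. */
        if ((i & 1) && mprotect(p, page, PROT_READ) != 0) {
            munmap(base, len);
            return -1;
        }
    }

    int done = 0;
    for (int i = 0; i < nr_forks; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            _exit(0); /* child: exit immediately */

        int status;
        if (waitpid(pid, &status, 0) != pid)
            break;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            done++;
    }

    munmap(base, len);
    return done;
}
```

      Calling fork_stress(10000, 5000) from a small main() and timing it
      with and without CONFIG_PER_VMA_LOCK would approximate the
      comparison quoted above; the absolute numbers depend on the machine.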
      
      Suggested-by: David Hildenbrand <david@redhat.com>
      Reported-by: Jiri Slaby <jirislaby@kernel.org>
      Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
      Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
      Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
      Reported-by: Jacob Young <jacobly.alt@gmail.com>
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
      Fixes: 0bff0aae ("x86/mm: try VMA lock-based page fault handling first")
      Cc: stable@vger.kernel.org
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: lock newly mapped VMA which can be modified after it becomes visible · 33313a74
      Suren Baghdasaryan authored
      
      
      mmap_region() adds a newly created VMA into the VMA tree and might modify it
      afterwards before dropping the mmap_lock.  This poses a problem for page
      faults handled under per-VMA locks because they don't take the mmap_lock
      and can stumble on this VMA while it's still being modified.  Currently
      this does not pose a problem since post-addition modifications are done
      only for file-backed VMAs, which are not handled under per-VMA lock.
      However, once support for handling file-backed page faults with per-VMA
      locks is added, this will become a race.
      
      Fix this by write-locking the VMA before inserting it into the VMA tree.
      Other places where a new VMA is added into the VMA tree do not modify it
      after the insertion, so they do not need the same locking.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: lock a vma before stack expansion · c137381f
      Suren Baghdasaryan authored
      
      
      With recent changes necessitating mmap_lock to be held for write while
      expanding a stack, per-VMA locks should follow the same rules and be
      write-locked to prevent page faults into the VMA being expanded. Add
      the necessary locking.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 7fcd473a
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "A few late arriving patches that missed the initial pull request. It's
        mostly bug fixes (the dt-bindings is a fix for the initial pull)"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Remove unused function declaration
        scsi: target: docs: Remove tcm_mod_builder.py
        scsi: target: iblock: Quiet bool conversion warning with pr_preempt use
        scsi: dt-bindings: ufs: qcom: Fix ICE phandle
        scsi: core: Simplify scsi_cdl_check_cmd()
        scsi: isci: Fix comment typo
        scsi: smartpqi: Replace one-element arrays with flexible-array members
        scsi: target: tcmu: Replace strlcpy() with strscpy()
        scsi: ncr53c8xx: Replace strlcpy() with strscpy()
        scsi: lpfc: Fix lpfc_name struct packing
    • Merge tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 84dc5aa3
      Linus Torvalds authored
      Pull more i2c updates from Wolfram Sang:
      
       - xiic patch should have been in the original pull but slipped through
      
       - mpc patch fixes a build regression
      
       - nomadik cleanup
      
      * tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: mpc: Drop unused variable
        i2c: nomadik: Remove a useless call in the remove function
        i2c: xiic: Don't try to handle more interrupt events after error
    • Merge tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 8fc3b8f0
      Linus Torvalds authored
      Pull hardening fixes from Kees Cook:
      
       - Check for NULL bdev in LoadPin (Matthias Kaehlcke)
      
       - Revert unwanted KUnit FORTIFY build default
      
       - Fix 1-element array causing boot warnings with xhci-hub
      
      * tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        usb: ch9: Replace bmSublinkSpeedAttr 1-element array with flexible array
        Revert "fortify: Allow KUnit test to build without FORTIFY"
        dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter
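      
      The 1-element array conversions in this log replace a trailing
      `type name[1]` with a C99 flexible array member, so that bounds
      checking no longer assumes the array holds exactly one element. A
      minimal userspace sketch of the pattern follows; the struct and
      function names are hypothetical and are not taken from the kernel
      sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical descriptor: a count followed by 'count' attribute words. */
struct speed_attrs {
    uint8_t count;
    uint32_t attr[]; /* flexible array member instead of attr[1] */
};

/* Allocate a descriptor with room for 'count' attributes (zero-filled). */
struct speed_attrs *speed_attrs_alloc(uint8_t count)
{
    struct speed_attrs *sa =
        calloc(1, sizeof(*sa) + (size_t)count * sizeof(sa->attr[0]));
    if (!sa)
        return NULL;
    sa->count = count;
    return sa;
}

/* Store 'value' at index 'i'; returns 0 on success, -1 if out of range. */
int speed_attrs_set(struct speed_attrs *sa, size_t i, uint32_t value)
{
    assert(sa != NULL);
    if (i >= sa->count)
        return -1;
    sa->attr[i] = value;
    return 0;
}
```

      With a flexible array member the header size no longer includes a
      phantom first element, so the allocation size is simply the header
      plus count elements.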
    • Merge tag 'perf-tools-for-v6.5-2-2023-07-06' of... · c206353d
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next
      
      Pull more perf tools updates from Namhyung Kim:
       "These are remaining changes and fixes for this cycle.
      
        Build:
      
         - Allow generating vmlinux.h from BTF using `make GEN_VMLINUX_H=1`
           and skip if the vmlinux has no BTF.
      
         - Replace deprecated clang -target xxx option by --target=xxx.
      
        perf record:
      
         - Print event attributes with well known type and config symbols in
           the debug output like below:
      
             # perf record -e cycles,cpu-clock -C0 -vv true
             <SNIP>
             ------------------------------------------------------------
             perf_event_attr:
               type                             0 (PERF_TYPE_HARDWARE)
               size                             136
               config                           0 (PERF_COUNT_HW_CPU_CYCLES)
               { sample_period, sample_freq }   4000
               sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
               read_format                      ID
               disabled                         1
               inherit                          1
               freq                             1
               sample_id_all                    1
               exclude_guest                    1
             ------------------------------------------------------------
             sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
             ------------------------------------------------------------
             perf_event_attr:
               type                             1 (PERF_TYPE_SOFTWARE)
               size                             136
               config                           0 (PERF_COUNT_SW_CPU_CLOCK)
               { sample_period, sample_freq }   4000
               sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
               read_format                      ID
               disabled                         1
               inherit                          1
               freq                             1
               sample_id_all                    1
               exclude_guest                    1
      
         - Update AMD IBS event error message since it now supports per-process
           profiling but no privilege filters.
      
             $ sudo perf record -e ibs_op//k -C 0
             Error:
             AMD IBS doesn't support privilege filtering. Try again without
             the privilege modifiers (like 'k') at the end.
      
        perf lock contention:
      
         - Support CSV style output using -x option
      
             $ sudo perf lock con -ab -x, sleep 1
             # output: contended, total wait, max wait, avg wait, type, caller
             19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0
             15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e
             4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d
             1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135
             8, 67608, 27404, 8451, spinlock, __queue_work+0x174
             3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff
             3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248
             2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad
             1, 24619, 24619, 24619, spinlock, rcu_core+0xd4
      
         - Add --output option to save the data to a file so that it is not
           interleaved with other debug messages.
      
        Test:
      
         - Fix event parsing test on ARM, where there is no raw PMU and
           PERF_PMU_CAP_EXTENDED_HW_TYPE is not supported.
      
         - Update the lock contention test case for CSV output.
      
         - Fix a segfault in the daemon command test.
      
        Vendor events (JSON):
      
         - Add has_event() to check if the given event is available on the
           system at runtime. On Intel machines, some transaction events may
           not be present when TSX extensions are disabled.
      
         - Update Intel event metrics.
      
        Misc:
      
         - Sort symbols by name using an external array of pointers instead of
           an rbtree node in the symbol. This saves 16 or 24 bytes per symbol,
           whether or not the sorting is actually requested.
      
         - Fix unwinding DWARF callstacks using libdw when --symfs option is
           used"
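      
      The symbol-sorting change listed under "Misc" can be illustrated
      with a small sketch. This is not perf's code: `struct sym` and both
      functions are hypothetical, and the sketch only shows the general
      idea of keeping the by-name ordering in a separately allocated array
      of pointers, so the symbol itself carries no extra tree node.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol record; note there is no embedded tree node. */
struct sym {
    unsigned long start;
    const char *name;
};

/* Compare two array slots (each a pointer to a symbol) by name. */
static int cmp_by_name(const void *a, const void *b)
{
    const struct sym *sa = *(const struct sym *const *)a;
    const struct sym *sb = *(const struct sym *const *)b;

    return strcmp(sa->name, sb->name);
}

/* Build a name-sorted index over syms[0..n-1]; caller frees the result. */
struct sym **sort_by_name(struct sym *syms, size_t n)
{
    assert(syms != NULL && n > 0);

    struct sym **idx = malloc(n * sizeof(*idx));
    if (!idx)
        return NULL;
    for (size_t i = 0; i < n; i++)
        idx[i] = &syms[i];
    qsort(idx, n, sizeof(*idx), cmp_by_name);
    return idx;
}

/* Binary-search the index for 'name'; returns NULL if it is absent. */
struct sym *find_by_name(struct sym **idx, size_t n, const char *name)
{
    struct sym key = { .start = 0, .name = name };
    struct sym *keyp = &key;
    struct sym **hit = bsearch(&keyp, idx, n, sizeof(*idx), cmp_by_name);

    return hit ? *hit : NULL;
}
```

      The index is only allocated when a by-name lookup is wanted, which
      is the trade-off the item above describes: one pointer per symbol on
      demand rather than a node embedded in every symbol.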
      
      * tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next: (38 commits)
        perf test: Fix event parsing test when PERF_PMU_CAP_EXTENDED_HW_TYPE isn't supported.
        perf test: Fix event parsing test on Arm
        perf evsel amd: Fix IBS error message
        perf: unwind: Fix symfs with libdw
        perf symbol: Fix uninitialized return value in symbols__find_by_name()
        perf test: Test perf lock contention CSV output
        perf lock contention: Add --output option
        perf lock contention: Add -x option for CSV style output
        perf lock: Remove stale comments
        perf vendor events intel: Update tigerlake to 1.13
        perf vendor events intel: Update skylakex to 1.31
        perf vendor events intel: Update skylake to 57
        perf vendor events intel: Update sapphirerapids to 1.14
        perf vendor events intel: Update icelakex to 1.21
        perf vendor events intel: Update icelake to 1.19
        perf vendor events intel: Update cascadelakex to 1.19
        perf vendor events intel: Update meteorlake to 1.03
        perf vendor events intel: Add rocketlake events/metrics
        perf vendor metrics intel: Make transaction metrics conditional
        perf jevents: Support for has_event function
        ...
    • Merge tag 'bitmap-6.5-rc1' of https://github.com/norov/linux · ad8258e8
      Linus Torvalds authored
      Pull bitmap updates from Yury Norov:
       "Fixes for different bitmap pieces:
      
         - lib/test_bitmap: increment failure counter properly
      
           The tests that don't use the expect_eq() macro to determine that a
           test has failed must increment failed_tests explicitly.
      
         - lib/bitmap: drop optimization of bitmap_{from,to}_arr64
      
           bitmap_{from,to}_arr64() optimization is overly optimistic
           on 32-bit LE architectures when it's wired to
           bitmap_copy_clear_tail().
      
         - nodemask: Drop duplicate check in for_each_node_mask()
      
           As the return value type of first_node() became unsigned, the
           `node >= 0` check became unnecessary.
      
         - cpumask: fix function description kernel-doc notation
      
         - MAINTAINERS: Add bits.h and bitfield.h to the BITMAP API record
      
           Add linux/bits.h and linux/bitfield.h for visibility"
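      
      The nodemask item above rests on a plain C fact: an unsigned index
      is never negative, so a loop only needs an upper bound. The sketch
      below shows that loop shape under stated assumptions; MAX_NODES,
      next_node_from() and count_nodes() are hypothetical stand-ins, not
      the kernel's nodemask API.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 64u /* hypothetical limit for this sketch */

/* Index of the first set bit at or above 'from', or MAX_NODES if none. */
static unsigned int next_node_from(uint64_t mask, unsigned int from)
{
    assert(from <= MAX_NODES);

    for (unsigned int n = from; n < MAX_NODES; n++)
        if (mask & (UINT64_C(1) << n))
            return n;
    return MAX_NODES;
}

/* Count set nodes; the only loop condition needed is the upper bound. */
unsigned int count_nodes(uint64_t mask)
{
    unsigned int count = 0;

    for (unsigned int node = next_node_from(mask, 0);
         node < MAX_NODES; /* a "node >= 0" test would always be true */
         node = next_node_from(mask, node + 1))
        count++;
    return count;
}
```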
      
      * tag 'bitmap-6.5-rc1' of https://github.com/norov/linux:
        MAINTAINERS: Add bitfield.h to the BITMAP API record
        MAINTAINERS: Add bits.h to the BITMAP API record
        cpumask: fix function description kernel-doc notation
        nodemask: Drop duplicate check in for_each_node_mask()
        lib/bitmap: drop optimization of bitmap_{from,to}_arr64
        lib/test_bitmap: increment failure counter properly
  2. Jul 08, 2023
    • Merge tag 'mmc-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 8689f4f2
      Linus Torvalds authored
      Pull mmc fix from Ulf Hansson:
      
       - Fix regression of detection of eMMC/SD/SDIO cards
      
      * tag 'mmc-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: Revert "mmc: core: Allow mmc_start_host() synchronously detect a card"
    • Merge tag 'sound-fix-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 4c8ab068
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small fixes that have been gathered recently:
      
         - Two code-typo fixes in the new UMP core
      
         - A fix in jack reporting to avoid the usage of mutex
      
         - A potential data race fix in HD-audio core regmap code
      
         - A potential data race fix in PCM allocation helper code
      
         - HD-audio quirks for ASUS, Clevo and Unis machines
      
         - Constifications in FireWire drivers"
      
      * tag 'sound-fix-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek: Add quirk for ASUS ROG GZ301V
        ALSA: jack: Fix mutex call in snd_jack_report()
        ALSA: seq: ump: fix typo in system_2p_ev_to_ump_midi1()
        ALSA: hda/realtek: Whitespace fix
        ALSA: hda/realtek: Add quirk for ASUS ROG G614Jx
        ALSA: hda/realtek: Amend G634 quirk to enable rear speakers
        ALSA: hda/realtek: Add quirk for ASUS ROG GA402X
        ALSA: hda/realtek: Add quirk for ASUS ROG GX650P
        ALSA: pcm: Fix potential data race at PCM memory allocation helpers
        ALSA: hda: fix a possible null-pointer dereference due to data race in snd_hdac_regmap_sync()
        ALSA: hda/realtek: Add quirks for Unis H3C Desktop B760 & Q760
        ALSA: hda/realtek: Add quirk for Clevo NPx0SNx
        ALSA: ump: Correct wrong byte size at converting a UMP System message
        ALSA: fireface: make read-only const array for model names static
        ALSA: oxfw: make read-only const array models static
    • Merge tag 'ceph-for-6.5-rc1' of https://github.com/ceph/ceph-client · 3290badd
      Linus Torvalds authored
      Pull ceph updates from Ilya Dryomov:
       "A bunch of CephFS fixups from Xiubo, mostly around dropping caps,
        along with a fix for a regression in the readahead handling code which
        sneaked in with the switch to netfs helpers"
      
      * tag 'ceph-for-6.5-rc1' of https://github.com/ceph/ceph-client:
        ceph: don't let check_caps skip sending responses for revoke msgs
        ceph: issue a cap release immediately if no cap exists
        ceph: trigger to flush the buffer when making snapshot
        ceph: fix blindly expanding the readahead windows
        ceph: add a dedicated private data for netfs rreq
        ceph: voluntarily drop Xx caps for requests those touch parent mtime
        ceph: try to dump the msgs when decoding fails
        ceph: only send metrics when the MDS rank is ready
    • Merge tag 'ntfs3_for_6.5' of https://github.com/Paragon-Software-Group/linux-ntfs3 · 36b93aed
      Linus Torvalds authored
      Pull ntfs3 updates from Konstantin Komarov:
       "Updates:
         - support /proc/fs/ntfs3/<dev>/volinfo and label
         - alternative boot if primary boot is corrupted
         - small optimizations
      
        Fixes:
         - fix endian problems
         - fix logic errors
         - code refactoring and reformatting"
      
      * tag 'ntfs3_for_6.5' of https://github.com/Paragon-Software-Group/linux-ntfs3:
        fs/ntfs3: Correct mode for label entry inside /proc/fs/ntfs3/
        fs/ntfs3: Add support /proc/fs/ntfs3/<dev>/volinfo and /proc/fs/ntfs3/<dev>/label
        fs/ntfs3: Fix endian problem
        fs/ntfs3: Add ability to format new mft records with bigger/smaller header
        fs/ntfs3: Code refactoring
        fs/ntfs3: Code formatting
        fs/ntfs3: Do not update primary boot in ntfs_init_from_boot()
        fs/ntfs3: Alternative boot if primary boot is corrupted
        fs/ntfs3: Mark ntfs dirty when on-disk struct is corrupted
        fs/ntfs3: Fix ntfs_atomic_open
        fs/ntfs3: Correct checking while generating attr_list
        fs/ntfs3: Use __GFP_NOWARN allocation at ntfs_load_attr_list()
        fs: ntfs3: Fix possible null-pointer dereferences in mi_read()
        fs/ntfs3: Return error for inconsistent extended attributes
        fs/ntfs3: Enhance sanity check while generating attr_list
        fs/ntfs3: Use wrapper i_blocksize() in ntfs_zero_range()
        ntfs: Fix panic about slab-out-of-bounds caused by ntfs_listxattr()
    • Merge tag 'fsnotify_for_v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 986ffe60
      Linus Torvalds authored
      Pull fsnotify fix from Jan Kara:
       "A fix for fanotify to disallow creating of mount or superblock marks
        for kernel internal pseudo filesystems"
      
      * tag 'fsnotify_for_v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        fanotify: disallow mount/sb marks on kernel internal pseudo fs
    • Merge tag 'riscv-for-linus-6.5-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 4f6b6c2b
      Linus Torvalds authored
      Pull more RISC-V updates from Palmer Dabbelt:
      
       - A bunch of fixes/cleanups from the first part of the merge window,
         mostly related to ACPI and vector as those were large
      
       - Some documentation improvements, mostly related to the new code
      
       - The "riscv,isa" DT key is deprecated
      
       - Support for link-time dead code elimination
      
       - Support for minor fault registration in userfaultfd
      
       - A handful of cleanups around CMO alternatives
      
      * tag 'riscv-for-linus-6.5-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (23 commits)
        riscv: mm: mark noncoherent_supported as __ro_after_init
        riscv: mm: mark CBO relate initialization funcs as __init
        riscv: errata: thead: only set cbom size & noncoherent during boot
        riscv: Select HAVE_ARCH_USERFAULTFD_MINOR
        RISC-V: Document the ISA string parsing rules for ACPI
        risc-v: Fix order of IPI enablement vs RCU startup
        mm: riscv: fix an unsafe pte read in huge_pte_alloc()
        dt-bindings: riscv: deprecate riscv,isa
        RISC-V: drop error print from riscv_hartid_to_cpuid()
        riscv: Discard vector state on syscalls
        riscv: move memblock_allow_resize() after linear mapping is ready
        riscv: Enable ARCH_SUSPEND_POSSIBLE for s2idle
        riscv: vdso: include vdso/vsyscall.h for vdso_data
        selftests: Test RISC-V Vector's first-use handler
        riscv: vector: clear V-reg in the first-use trap
        riscv: vector: only enable interrupts in the first-use trap
        RISC-V: Fix up some vector state related build failures
        RISC-V: Document that V registers are clobbered on syscalls
        riscv: disable HAVE_LD_DEAD_CODE_DATA_ELIMINATION for LLD
        riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
        ...
    • Merge tag 'powerpc-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 22dcc7d7
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix PCIe MEM size for pci2 node on Turris 1.x boards
      
       - Two minor build fixes
      
      Thanks to Christophe Leroy, Douglas Anderson, Pali Rohár, Petr Mladek,
      and Randy Dunlap.
      
      * tag 'powerpc-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: dts: turris1x.dts: Fix PCIe MEM size for pci2 node
        powerpc: Include asm/nmi.c in mobility.c for watchdog_hardlockup_set_timeout_pct()
        powerpc: allow PPC_EARLY_DEBUG_CPM only when SERIAL_CPM=y
    • Merge tag 'apparmor-pr-2023-07-06' of... · 70806ee1
      Linus Torvalds authored
      Merge tag 'apparmor-pr-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
      
      Pull apparmor updates from John Johansen:
      
       - fix missing error check for rhashtable_insert_fast
      
       - add missing failure check in compute_xmatch_perms
      
       - fix policy_compat permission remap with extended permissions
      
       - fix profile verification and enable it
      
       - fix kzalloc perms tables for shared dfas
      
       - Fix kernel-doc header for verify_dfa_accept_index
      
       - aa_buffer: Convert 1-element array to flexible array
      
       - Return directly after a failed kzalloc() in two functions
      
       - fix use of strcpy in policy_unpack_test
      
       - fix kernel-doc complaints
      
       - Fix some kernel-doc comments
      
      * tag 'apparmor-pr-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
        apparmor: Fix kernel-doc header for verify_dfa_accept_index
        apparmor: fix: kzalloc perms tables for shared dfas
        apparmor: fix profile verification and enable it
        apparmor: fix policy_compat permission remap with extended permissions
        apparmor: aa_buffer: Convert 1-element array to flexible array
        apparmor: add missing failure check in compute_xmatch_perms
        apparmor: fix missing error check for rhashtable_insert_fast
        apparmor: Return directly after a failed kzalloc() in two functions
        AppArmor: Fix some kernel-doc comments
        apparmor: fix use of strcpy in policy_unpack_test
        apparmor: fix kernel-doc complaints
  3. Jul 07, 2023