  1. Feb 08, 2024
    • fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super · 79d72c68
      Oscar Salvador authored
      When configuring a hugetlb filesystem via the fsconfig() syscall, there is
      a possible NULL dereference in hugetlbfs_fill_super() caused by assigning
      NULL to ctx->hstate in hugetlbfs_parse_param() when the requested pagesize
      is not valid.
      
      E.g., taking the following steps:
      
           fd = fsopen("hugetlbfs", FSOPEN_CLOEXEC);
           fsconfig(fd, FSCONFIG_SET_STRING, "pagesize", "1024", 0);
           fsconfig(fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
      
      Given that the requested "pagesize" is invalid, ctx->hstate will be replaced
      with NULL, losing its previous value, and we will print an error:
      
       ...
       ...
       case Opt_pagesize:
       ps = memparse(param->string, &rest);
       ctx->hstate = size_to_hstate(ps);
       if (!ctx->hstate) {
               pr_err("Unsupported page size %lu MB\n", ps / SZ_1M);
               return -EINVAL;
       }
       return 0;
       ...
       ...
      
      This is a problem because later on, we will dereference ctx->hstate in
      hugetlbfs_fill_super():
      
       ...
       ...
       sb->s_blocksize = huge_page_size(ctx->hstate);
       ...
       ...
      
      causing the Oops below.
      
      Fix this by replacing the ctx->hstate value only when the pagesize is known
      to be valid.
      
       kernel: hugetlbfs: Unsupported page size 0 MB
       kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
       kernel: #PF: supervisor read access in kernel mode
       kernel: #PF: error_code(0x0000) - not-present page
       kernel: PGD 800000010f66c067 P4D 800000010f66c067 PUD 1b22f8067 PMD 0
       kernel: Oops: 0000 [#1] PREEMPT SMP PTI
       kernel: CPU: 4 PID: 5659 Comm: syscall Tainted: G            E      6.8.0-rc2-default+ #22 5a47c3fef76212addcc6eb71344aabc35190ae8f
       kernel: Hardware name: Intel Corp. GROVEPORT/GROVEPORT, BIOS GVPRCRB1.86B.0016.D04.1705030402 05/03/2017
       kernel: RIP: 0010:hugetlbfs_fill_super+0xb4/0x1a0
       kernel: Code: 48 8b 3b e8 3e c6 ed ff 48 85 c0 48 89 45 20 0f 84 d6 00 00 00 48 b8 ff ff ff ff ff ff ff 7f 4c 89 e7 49 89 44 24 20 48 8b 03 <8b> 48 28 b8 00 10 00 00 48 d3 e0 49 89 44 24 18 48 8b 03 8b 40 28
       kernel: RSP: 0018:ffffbe9960fcbd48 EFLAGS: 00010246
       kernel: RAX: 0000000000000000 RBX: ffff9af5272ae780 RCX: 0000000000372004
       kernel: RDX: ffffffffffffffff RSI: ffffffffffffffff RDI: ffff9af555e9b000
       kernel: RBP: ffff9af52ee66b00 R08: 0000000000000040 R09: 0000000000370004
       kernel: R10: ffffbe9960fcbd48 R11: 0000000000000040 R12: ffff9af555e9b000
       kernel: R13: ffffffffa66b86c0 R14: ffff9af507d2f400 R15: ffff9af507d2f400
       kernel: FS:  00007ffbc0ba4740(0000) GS:ffff9b0bd7000000(0000) knlGS:0000000000000000
       kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       kernel: CR2: 0000000000000028 CR3: 00000001b1ee0000 CR4: 00000000001506f0
       kernel: Call Trace:
       kernel:  <TASK>
       kernel:  ? __die_body+0x1a/0x60
       kernel:  ? page_fault_oops+0x16f/0x4a0
       kernel:  ? search_bpf_extables+0x65/0x70
       kernel:  ? fixup_exception+0x22/0x310
       kernel:  ? exc_page_fault+0x69/0x150
       kernel:  ? asm_exc_page_fault+0x22/0x30
       kernel:  ? __pfx_hugetlbfs_fill_super+0x10/0x10
       kernel:  ? hugetlbfs_fill_super+0xb4/0x1a0
       kernel:  ? hugetlbfs_fill_super+0x28/0x1a0
       kernel:  ? __pfx_hugetlbfs_fill_super+0x10/0x10
       kernel:  vfs_get_super+0x40/0xa0
       kernel:  ? __pfx_bpf_lsm_capable+0x10/0x10
       kernel:  vfs_get_tree+0x25/0xd0
       kernel:  vfs_cmd_create+0x64/0xe0
       kernel:  __x64_sys_fsconfig+0x395/0x410
       kernel:  do_syscall_64+0x80/0x160
       kernel:  ? syscall_exit_to_user_mode+0x82/0x240
       kernel:  ? do_syscall_64+0x8d/0x160
       kernel:  ? syscall_exit_to_user_mode+0x82/0x240
       kernel:  ? do_syscall_64+0x8d/0x160
       kernel:  ? exc_page_fault+0x69/0x150
       kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0x76
       kernel: RIP: 0033:0x7ffbc0cb87c9
       kernel: Code: 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 96 0d 00 f7 d8 64 89 01 48
       kernel: RSP: 002b:00007ffc29d2f388 EFLAGS: 00000206 ORIG_RAX: 00000000000001af
       kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffbc0cb87c9
       kernel: RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003
       kernel: RBP: 00007ffc29d2f3b0 R08: 0000000000000000 R09: 0000000000000000
       kernel: R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
       kernel: R13: 00007ffc29d2f4c0 R14: 0000000000000000 R15: 0000000000000000
       kernel:  </TASK>
       kernel: Modules linked in: rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) netfs(E) af_packet(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) intel_rapl_msr(E) intel_rapl_common(E) iTCO_wdt(E) intel_pmc_bxt(E) sb_edac(E) iTCO_vendor_support(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) rfkill(E) ipmi_ssif(E) kvm(E) acpi_ipmi(E) irqbypass(E) pcspkr(E) igb(E) ipmi_si(E) mei_me(E) i2c_i801(E) joydev(E) intel_pch_thermal(E) i2c_smbus(E) dca(E) lpc_ich(E) mei(E) ipmi_devintf(E) ipmi_msghandler(E) acpi_pad(E) tiny_power_button(E) button(E) fuse(E) efi_pstore(E) configfs(E) ip_tables(E) x_tables(E) ext4(E) mbcache(E) jbd2(E) hid_generic(E) usbhid(E) sd_mod(E) t10_pi(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) polyval_clmulni(E) ahci(E) xhci_pci(E) polyval_generic(E) gf128mul(E) ghash_clmulni_intel(E) sha512_ssse3(E) sha256_ssse3(E) xhci_pci_renesas(E) libahci(E) ehci_pci(E) sha1_ssse3(E) xhci_hcd(E) ehci_hcd(E) libata(E)
       kernel:  mgag200(E) i2c_algo_bit(E) usbcore(E) wmi(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) scsi_common(E) aesni_intel(E) crypto_simd(E) cryptd(E)
       kernel: Unloaded tainted modules: acpi_cpufreq(E):1 fjes(E):1
       kernel: CR2: 0000000000000028
       kernel: ---[ end trace 0000000000000000 ]---
       kernel: RIP: 0010:hugetlbfs_fill_super+0xb4/0x1a0
       kernel: Code: 48 8b 3b e8 3e c6 ed ff 48 85 c0 48 89 45 20 0f 84 d6 00 00 00 48 b8 ff ff ff ff ff ff ff 7f 4c 89 e7 49 89 44 24 20 48 8b 03 <8b> 48 28 b8 00 10 00 00 48 d3 e0 49 89 44 24 18 48 8b 03 8b 40 28
       kernel: RSP: 0018:ffffbe9960fcbd48 EFLAGS: 00010246
       kernel: RAX: 0000000000000000 RBX: ffff9af5272ae780 RCX: 0000000000372004
       kernel: RDX: ffffffffffffffff RSI: ffffffffffffffff RDI: ffff9af555e9b000
       kernel: RBP: ffff9af52ee66b00 R08: 0000000000000040 R09: 0000000000370004
       kernel: R10: ffffbe9960fcbd48 R11: 0000000000000040 R12: ffff9af555e9b000
       kernel: R13: ffffffffa66b86c0 R14: ffff9af507d2f400 R15: ffff9af507d2f400
       kernel: FS:  00007ffbc0ba4740(0000) GS:ffff9b0bd7000000(0000) knlGS:0000000000000000
       kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       kernel: CR2: 0000000000000028 CR3: 00000001b1ee0000 CR4: 00000000001506f0
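      
      The "validate first, assign afterwards" idea behind the fix can be
      sketched in userspace C.  This is only an illustration, not the kernel
      patch: the structs are simplified stand-ins, the table of supported
      sizes is hypothetical, and size_to_hstate() is mocked.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel types; names mirror the commit text. */
struct hstate {
	unsigned long size;	/* huge page size in bytes */
};

struct hugetlbfs_fs_context {
	struct hstate *hstate;	/* must stay valid once initialised */
};

/* Hypothetical table of supported sizes (2 MiB and 1 GiB in this sketch). */
static struct hstate supported[] = {
	{ 2UL << 20 },
	{ 1UL << 30 },
};

/* Mock of size_to_hstate(): return the matching hstate or NULL. */
static struct hstate *size_to_hstate(unsigned long size)
{
	for (size_t i = 0; i < sizeof(supported) / sizeof(supported[0]); i++)
		if (supported[i].size == size)
			return &supported[i];
	return NULL;
}

/* Validate first, then assign: ctx->hstate keeps its old value on error. */
static int parse_pagesize(struct hugetlbfs_fs_context *ctx, unsigned long ps)
{
	struct hstate *h = size_to_hstate(ps);

	if (!h) {
		fprintf(stderr, "Unsupported page size %lu MB\n", ps >> 20);
		return -EINVAL;
	}
	ctx->hstate = h;
	return 0;
}
```

      With a context that starts out pointing at a valid hstate, calling
      parse_pagesize(&ctx, 1024) returns -EINVAL and leaves ctx.hstate
      untouched, so a later dereference stays safe.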
      
      Link: https://lkml.kernel.org/r/20240130210418.3771-1-osalvador@suse.de
      Fixes: 32021982 ("hugetlbfs: Convert to fs_context")
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Oscar Salvador <osalvador@suse.de>
      Acked-by: Muchun Song <muchun.song@linux.dev>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mailmap: switch email address for John Moon · f2076032
      John Moon authored
      
      
      Add current email address as QUIC email is no longer active.
      
      Link: https://lkml.kernel.org/r/20240131034311.46706-1-john@jmoon.dev
      Signed-off-by: John Moon <john@jmoon.dev>
      Acked-by: Trilok Soni <quic_tsoni@quicinc.com>
      Cc: Elliot Berman <quic_eberman@quicinc.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: zswap: fix objcg use-after-free in entry destruction · 2e601e1e
      Johannes Weiner authored
      In the per-memcg LRU universe, LRU removal uses entry->objcg to determine
      which list count needs to be decreased.  Drop the objcg reference after
      updating the LRU, to fix a possible use-after-free.
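      
      The ordering described above can be illustrated with a small userspace
      sketch.  Everything here is hypothetical (a toy refcounted objcg, a toy
      LRU counter, and made-up function names); it only shows why the
      reference must be dropped after the LRU update, not how zswap actually
      implements it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an obj_cgroup: a refcount plus an LRU counter. */
struct objcg {
	int refs;
	int lru_count;
};

struct entry {
	struct objcg *objcg;
};

static int objcg_freed;	/* counts how many objcgs were released */

/* Drop one reference; free the objcg when the last one goes away. */
static void objcg_put(struct objcg *cg)
{
	if (--cg->refs == 0) {
		objcg_freed++;
		free(cg);
	}
}

/* LRU removal reads entry->objcg to pick the counter to decrement. */
static void lru_del(struct entry *e)
{
	e->objcg->lru_count--;
}

/* Correct order: update the LRU first, drop the reference last. */
static void entry_destroy(struct entry *e)
{
	struct objcg *cg = e->objcg;

	lru_del(e);	/* safe: the entry's reference keeps cg alive */
	objcg_put(cg);	/* cg may be freed here; do not touch it afterwards */
	free(e);
}
```

      Swapping the two calls in entry_destroy() would let lru_del() read an
      objcg that objcg_put() may already have freed.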
      
      Link: https://lkml.kernel.org/r/20240130013438.565167-1-hannes@cmpxchg.org
      Fixes: a65b0e76 ("zswap: make shrinking memcg-aware")
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Yosry Ahmed <yosryahmed@google.com>
      Reviewed-by: Nhat Pham <nphamcs@gmail.com>
      Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/madvise: don't forget to leave lazy MMU mode in madvise_cold_or_pageout_pte_range() · 4c2da318
      Sergey Senozhatsky authored
      We need to leave lazy MMU mode before unlocking.
      
      Link: https://lkml.kernel.org/r/20240126032608.355899-1-senozhatsky@chromium.org
      Fixes: b2f557a2 ("mm/madvise: add cond_resched() in madvise_cold_or_pageout_pte_range()")
      Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: Jiexun Wang <wangjiexun@tinylab.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • arch/arm/mm: fix major fault accounting when retrying under per-VMA lock · e870920b
      Suren Baghdasaryan authored
      The change [1] missed the ARM architecture when fixing major fault accounting
      for page fault retry under per-VMA lock.
      
      The user-visible effect is that it restores correct major fault
      accounting that was broken after [2] was merged in the 6.7 kernel. The
      more detailed description is in [3] and this patch simply adds the
      same fix to the ARM architecture, which I missed in [3].
      
      Add the missing code to fix ARM architecture fault accounting.
      
      [1] 46e714c7 ("arch/mm/fault: fix major fault accounting when retrying under per-VMA lock")
      [2] https://lore.kernel.org/all/20231006195318.4087158-6-willy@infradead.org/
      [3] https://lore.kernel.org/all/20231226214610.109282-1-surenb@google.com/
      
      Link: https://lkml.kernel.org/r/20240123064305.2829244-1-surenb@google.com
      Fixes: 12214eba ("mm: handle read faults under the VMA lock")
      Reported-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: core: include linux/close_range.h for CLOSE_RANGE_* macros · 01c1484a
      Muhammad Usama Anjum authored
      The correct header file is needed to get the CLOSE_RANGE_* macros.
      Previously this was tested only with a newer glibc, which hid the need to
      include the header; that was a mistake.
      
      Link: https://lkml.kernel.org/r/20231024155137.219700-1-usama.anjum@collabora.com
      Fixes: ec544249 ("selftests: core: remove duplicate defines")
      Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
      Link: https://lore.kernel.org/all/7161219e-0223-d699-d6f3-81abd9abf13b@arm.com
      Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page · 2fde9e7f
      Miaohe Lin authored
      
      
      While running a soft offline stress test, a machine was observed to crash
      with the following message:
      
        kernel BUG at include/linux/memcontrol.h:554!
        invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
        CPU: 5 PID: 3837 Comm: hwpoison.sh Not tainted 6.7.0-next-20240112-00001-g8ecf3e7fb7c8-dirty #97
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
        RIP: 0010:folio_memcg+0xaf/0xd0
        Code: 10 5b 5d c3 cc cc cc cc 48 c7 c6 08 b1 f2 b2 48 89 ef e8 b4 c5 f8 ff 90 0f 0b 48 c7 c6 d0 b0 f2 b2 48 89 ef e8 a2 c5 f8 ff 90 <0f> 0b 48 c7 c6 08 b1 f2 b2 48 89 ef e8 90 c5 f8 ff 90 0f 0b 66 66
        RSP: 0018:ffffb6c043657c98 EFLAGS: 00000296
        RAX: 000000000000004b RBX: ffff932bc1d1e401 RCX: ffff933abfb5c908
        RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff933abfb5c900
        RBP: ffffea6f04019080 R08: ffffffffb3338ce8 R09: 0000000000009ffb
        R10: 00000000000004dd R11: ffffffffb3308d00 R12: ffffea6f04019080
        R13: ffffea6f04019080 R14: 0000000000000001 R15: ffffb6c043657da0
        FS:  00007f6c60f6b740(0000) GS:ffff933abfb40000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000559c3bc8b980 CR3: 0000000107f1c000 CR4: 00000000000006f0
        Call Trace:
         <TASK>
         split_huge_page_to_list+0x4d/0x1380
         try_to_split_thp_page+0x3a/0xf0
         soft_offline_page+0x1ea/0x8a0
         soft_offline_page_store+0x52/0x90
         kernfs_fop_write_iter+0x118/0x1b0
         vfs_write+0x30b/0x430
         ksys_write+0x5e/0xe0
         do_syscall_64+0xb0/0x1b0
         entry_SYSCALL_64_after_hwframe+0x6d/0x75
        RIP: 0033:0x7f6c60d14697
        Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
        RSP: 002b:00007ffe9b72b8d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
        RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f6c60d14697
        RDX: 000000000000000c RSI: 0000559c3bc8b980 RDI: 0000000000000001
        RBP: 0000559c3bc8b980 R08: 00007f6c60dd1460 R09: 000000007fffffff
        R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
        R13: 00007f6c60e1a780 R14: 00007f6c60e16600 R15: 00007f6c60e15a00
      
      The problem is that page->mapping is now overloaded with the slab->slab_list
      or slabs fields, so slab pages can be mistaken for non-LRU movable pages if
      the slabs field contains PAGE_MAPPING_MOVABLE or slab_list->prev is set to
      LIST_POISON2.  These slab pages are later treated as THPs, leading to a
      crash in split_huge_page_to_list().
      
      Link: https://lkml.kernel.org/r/20240126065837.2100184-1-linmiaohe@huawei.com
      Link: https://lkml.kernel.org/r/20240124084014.1772906-1-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Fixes: 130d4df5 ("mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head")
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: memcg: optimize parent iteration in memcg_rstat_updated() · 9cee7e8e
      Yosry Ahmed authored
      
      
      In memcg_rstat_updated(), we iterate the memcg being updated and its
      parents to update memcg->vmstats_percpu->stats_updates in the fast path
      (i.e. no atomic updates). According to my math, this is 3 memory loads
      (and potentially 3 cache misses) per memcg:
      - Load the address of memcg->vmstats_percpu.
      - Load vmstats_percpu->stats_updates (based on some percpu calculation).
      - Load the address of the parent memcg.
      
      Avoid most of the cache misses by caching a pointer from each struct
      memcg_vmstats_percpu to its parent on the corresponding CPU. In this
      case, for the first memcg we have 2 memory loads (same as above):
      - Load the address of memcg->vmstats_percpu.
      - Load vmstats_percpu->stats_updates (based on some percpu calculation).
      
      Then for each additional memcg, we need a single load to get the
      parent's stats_updates directly. This reduces the number of loads from
      O(3N) to O(2+N) -- where N is the number of memcgs we need to iterate.
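      
      The cached parent pointer walk can be sketched as follows.  This is a
      single-CPU userspace illustration with a hypothetical, simplified
      struct; the real struct memcg_vmstats_percpu and the percpu machinery
      are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU stats node for one memcg on one CPU. */
struct vmstats_percpu {
	unsigned int stats_updates;
	/* Cached pointer to the parent memcg's node on the same CPU. */
	struct vmstats_percpu *parent;
};

/* Propagate an update up the hierarchy; returns how many nodes were touched. */
static int rstat_updated(struct vmstats_percpu *statc, unsigned int val)
{
	int visited = 0;

	/* Each step follows the cached pointer instead of going via the memcg. */
	for (; statc; statc = statc->parent) {
		statc->stats_updates += val;
		visited++;
	}
	return visited;
}
```

      For a leaf three levels deep, rstat_updated() touches three nodes and
      never needs to load the memcg or its parent memcg along the way.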
      
      Additionally, stash a pointer to memcg->vmstats in each struct
      memcg_vmstats_percpu such that we can access the atomic counter that all
      CPUs fold into, memcg->vmstats->stats_updates.
      memcg_should_flush_stats() is changed to memcg_vmstats_needs_flush() to
      accept a struct memcg_vmstats pointer accordingly.
      
      In struct memcg_vmstats_percpu, make sure both pointers together with
      stats_updates live on the same cacheline. Finally, update
      mem_cgroup_alloc() to take in a parent pointer and initialize the new
      cache pointers on each CPU. The percpu loop in mem_cgroup_alloc() may
      look concerning, but there are multiple similar loops in the cgroup
      creation path (e.g. cgroup_rstat_init()), most of which are hidden
      within alloc_percpu().
      
      According to Oliver's testing [1], this fixes multiple 30-38%
      regressions in vm-scalability, will-it-scale-tlb_flush2, and
      will-it-scale-fallocate1. This comes at a cost of 2 more pointers per
      CPU (<2KB on a machine with 128 CPUs).
      
      [1] https://lore.kernel.org/lkml/ZbDJsfsZt2ITyo61@xsang-OptiPlex-9020/
      
      [yosryahmed@google.com: fix struct memcg_vmstats_percpu size and alignment]
        Link: https://lkml.kernel.org/r/20240203044612.1234216-1-yosryahmed@google.com
      Link: https://lkml.kernel.org/r/20240124100023.660032-1-yosryahmed@google.com
      Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
      Fixes: 8d59d221 ("mm: memcg: make stats flushing threshold per-memcg")
      Tested-by: kernel test robot <oliver.sang@intel.com>
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202401221624.cb53a8ca-oliver.sang@intel.com
      Acked-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Greg Thelen <gthelen@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nilfs2: fix data corruption in dsync block recovery for small block sizes · 67b8bcba
      Ryusuke Konishi authored
      
      
      The helper function nilfs_recovery_copy_block() of
      nilfs_recovery_dsync_blocks(), which recovers data from logs created by
      data sync writes during a mount after an unclean shutdown, incorrectly
      calculates the on-page offset when copying repair data to the file's page
      cache.  In environments where the block size is smaller than the page
      size, this flaw can cause data corruption and leak uninitialized memory
      bytes during the recovery process.
      
      Fix these issues by correcting this byte offset calculation on the page.
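      
      To make the "byte offset on the page" concrete, here is a userspace
      sketch.  It assumes that the destination offset is derived from the
      block's position in the file and that pages are 4096 bytes; the helper
      names are hypothetical and this is not the nilfs2 code itself.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096UL	/* assumed page size for this sketch */

/* Byte offset inside the page for a block that starts at file position pos. */
static unsigned long offset_in_page_of(unsigned long long pos)
{
	return (unsigned long)(pos & (PAGE_SIZE_BYTES - 1));
}

/* Copy one recovered block into the page at the offset derived from pos. */
static void copy_block(unsigned char *page, unsigned long long pos,
		       const unsigned char *block, unsigned long blocksize)
{
	memcpy(page + offset_in_page_of(pos), block, blocksize);
}
```

      With 1024-byte blocks, the block at file position 5120 lands at offset
      1024 of its page.  Copying it to any other offset would overwrite a
      neighbouring block and leave the intended range unwritten.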
      
      Link: https://lkml.kernel.org/r/20240124121936.10575-1-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/userfaultfd: UFFDIO_MOVE implementation should use ptep_get() · 56ae10cf
      Ryan Roberts authored
      Commit c33c7948 ("mm: ptep_get() conversion") converted all (non-arch)
      call sites to use ptep_get() instead of doing a direct dereference of the
      pte.  Full rationale can be found in that commit's log.
      
      Since then, UFFDIO_MOVE has been implemented which does 7 direct pte
      dereferences.  Let's fix those up to use ptep_get().
      
      I've asserted in the past that there is no reliable automated mechanism to
      catch these; I'm relying on a combination of Coccinelle (which throws up a
      lot of false positives) and some compiler magic to force a compiler error
      on dereference.  But given the frequency with which new issues are coming
      up, I'll add it to my todo list to try to find an automated solution.
      
      Link: https://lkml.kernel.org/r/20240123141755.3836179-1-ryan.roberts@arm.com
      Fixes: adef4406 ("userfaultfd: UFFDIO_MOVE uABI")
      Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
      Reviewed-by: Suren Baghdasaryan <surenb@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • exit: wait_task_zombie: kill the no longer necessary spin_lock_irq(siglock) · c1be35a1
      Oleg Nesterov authored
      
      
      After the recent changes nobody uses siglock to read the values protected
      by stats_lock, so we can kill spin_lock_irq(&current->sighand->siglock) and
      update the comment.
      
      With this patch only __exit_signal() and thread_group_start_cputime() take
      stats_lock under siglock.
      
      Link: https://lkml.kernel.org/r/20240123153359.GA21866@redhat.com
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats · 7601df80
      Oleg Nesterov authored
      
      
      lock_task_sighand() can trigger a hard lockup.  If NR_CPUS threads call
      do_task_stat() at the same time and the process has NR_THREADS, it will
      spin with irqs disabled O(NR_CPUS * NR_THREADS) time.
      
      Change do_task_stat() to use sig->stats_lock to gather the statistics
      outside of ->siglock protected section, in the likely case this code will
      run lockless.
      
      Link: https://lkml.kernel.org/r/20240123153357.GA21857@redhat.com
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand() · 60f92acb
      Oleg Nesterov authored
      
      
      Patch series "fs/proc: do_task_stat: use sig->stats_".
      
      do_task_stat() has the same problem as getrusage() had before "getrusage:
      use sig->stats_lock rather than lock_task_sighand()": a hard lockup.  If
      NR_CPUS threads call lock_task_sighand() at the same time and the process
      has NR_THREADS, spin_lock_irq will spin with irqs disabled O(NR_CPUS *
      NR_THREADS) time.
      
      
      This patch (of 3):
      
      thread_group_cputime() does its own locking, so we can safely shift
      thread_group_cputime_adjusted(), which does another for_each_thread loop,
      outside of the ->siglock protected section.
      
      Not only does this remove for_each_thread() from the critical section with
      irqs disabled, it also removes another case where stats_lock is taken with
      siglock held.  We want to remove this dependency, then we can change the
      users of stats_lock to not disable irqs.
      
      Link: https://lkml.kernel.org/r/20240123153313.GA21832@redhat.com
      Link: https://lkml.kernel.org/r/20240123153355.GA21854@redhat.com
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • getrusage: use sig->stats_lock rather than lock_task_sighand() · f7ec1cd5
      Oleg Nesterov authored
      
      
      lock_task_sighand() can trigger a hard lockup. If NR_CPUS threads call
      getrusage() at the same time and the process has NR_THREADS, spin_lock_irq
      will spin with irqs disabled O(NR_CPUS * NR_THREADS) time.
      
      Change getrusage() to use sig->stats_lock, it was specifically designed
      for this type of use. This way it runs lockless in the likely case.
      
      TODO:
      	- Change do_task_stat() to use sig->stats_lock too, then we can
      	  remove spin_lock_irq(siglock) in wait_task_zombie().
      
      	- Turn sig->stats_lock into seqcount_rwlock_t, this way the
      	  readers in the slow mode won't exclude each other. See
      	  https://lore.kernel.org/all/20230913154907.GA26210@redhat.com/
      
      	- stats_lock has to disable irqs because ->siglock can be taken
      	  in irq context, it would be very nice to change __exit_signal()
      	  to avoid the siglock->stats_lock dependency.
      
      Link: https://lkml.kernel.org/r/20240122155053.GA26214@redhat.com
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reported-by: Dylan Hatch <dylanbhatch@google.com>
      Tested-by: Dylan Hatch <dylanbhatch@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand() · daa694e4
      Oleg Nesterov authored
      
      
      Patch series "getrusage: use sig->stats_lock", v2.
      
      
      This patch (of 2):
      
      thread_group_cputime() does its own locking, so we can safely shift
      thread_group_cputime_adjusted(), which does another for_each_thread loop,
      outside of the ->siglock protected section.
      
      This is also preparation for the next patch, which changes getrusage() to
      use stats_lock instead of siglock; thread_group_cputime() takes the same
      lock.  With the current implementation recursive read_seqbegin_or_lock()
      is fine, since thread_group_cputime() can't enter the slow mode if the
      caller holds stats_lock, yet this looks safer and better performance-wise.
      
      Link: https://lkml.kernel.org/r/20240122155023.GA26169@redhat.com
      Link: https://lkml.kernel.org/r/20240122155050.GA26205@redhat.com
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reported-by: Dylan Hatch <dylanbhatch@google.com>
      Tested-by: Dylan Hatch <dylanbhatch@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: hugetlb pages should not be reserved by shmat() if SHM_NORESERVE · e656c7a9
      Prakash Sangappa authored
      
      
      For shared memory of type SHM_HUGETLB, hugetlb pages are reserved in the
      shmget() call.  If the SHM_NORESERVE flag is specified, the hugetlb pages
      are not reserved.  However, when the shared memory is attached with the
      shmat() call, hugetlb pages are incorrectly reserved for SHM_HUGETLB
      shared memory created with SHM_NORESERVE, which is a bug.
      
      -------------------------------
      Following test shows the issue.
      
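      $cat shmhtb.c
      
      #define _GNU_SOURCE	/* for SHM_HUGETLB and SHM_NORESERVE */
      #include <errno.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/ipc.h>
      #include <sys/shm.h>
      
      /* SKEY and SHMSZ are not given in the original message; assumed values. */
      #define SKEY	0x1234
      #define SHMSZ	(10UL << 20)	/* five huge pages if the size is 2 MB */
      
      int main(void)
      {
      	int shmflags = 0660 | IPC_CREAT | SHM_HUGETLB | SHM_NORESERVE;
      	int shmid;
      
      	/* Create the segment; no huge pages should be reserved here. */
      	shmid = shmget(SKEY, SHMSZ, shmflags);
      	if (shmid < 0)
      	{
      		printf("shmget() failed, %d\n", errno);
      		return 1;
      	}
      	printf("After shmget()\n");
      	system("cat /proc/meminfo | grep -i hugepages_");
      
      	/* Attach the segment; HugePages_Rsvd should still be 0 afterwards. */
      	shmat(shmid, NULL, 0);
      	printf("\nAfter shmat()\n");
      	system("cat /proc/meminfo | grep -i hugepages_");
      
      	shmctl(shmid, IPC_RMID, NULL);
      	return 0;
      }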
      
       #sysctl -w vm.nr_hugepages=20
       #./shmhtb
      
      After shmget()
      HugePages_Total:      20
      HugePages_Free:       20
      HugePages_Rsvd:        0
      HugePages_Surp:        0
      
      After shmat()
      HugePages_Total:      20
      HugePages_Free:       20
      HugePages_Rsvd:        5 <--
      HugePages_Surp:        0
      --------------------------------
      
      The fix is to ensure that hugetlb pages are not reserved for SHM_HUGETLB
      shared memory in the shmat() call.
      
      Link: https://lkml.kernel.org/r/1706040282-12388-1-git-send-email-prakash.sangappa@oracle.com
      Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
      Acked-by: Muchun Song <muchun.song@linux.dev>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  2. Feb 05, 2024
  3. Feb 04, 2024
    • Linux 6.8-rc3 · 54be6c6c
      Linus Torvalds authored
      v6.8-rc3
    • Merge tag 'for-linus-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 3f24fcda
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Miscellaneous bug fixes and cleanups in ext4's multi-block allocator
        and extent handling code"
      
      * tag 'for-linus-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits)
        ext4: make ext4_set_iomap() recognize IOMAP_DELALLOC map type
        ext4: make ext4_map_blocks() distinguish delalloc only extent
        ext4: add a hole extent entry in cache after punch
        ext4: correct the hole length returned by ext4_map_blocks()
        ext4: convert to exclusive lock while inserting delalloc extents
        ext4: refactor ext4_da_map_blocks()
        ext4: remove 'needed' in trace_ext4_discard_preallocations
        ext4: remove unnecessary parameter "needed" in ext4_discard_preallocations
        ext4: remove unused return value of ext4_mb_release_group_pa
        ext4: remove unused return value of ext4_mb_release_inode_pa
        ext4: remove unused return value of ext4_mb_release
        ext4: remove unused ext4_allocation_context::ac_groups_considered
        ext4: remove unneeded return value of ext4_mb_release_context
        ext4: remove unused parameter ngroup in ext4_mb_choose_next_group_*()
        ext4: remove unused return value of __mb_check_buddy
        ext4: mark the group block bitmap as corrupted before reporting an error
        ext4: avoid allocating blocks from corrupted group in ext4_mb_find_by_goal()
        ext4: avoid allocating blocks from corrupted group in ext4_mb_try_best_found()
        ext4: avoid dividing by 0 in mb_update_avg_fragment_size() when block bitmap corrupt
        ext4: avoid bb_free and bb_fragments inconsistency in mb_free_blocks()
        ...
      3f24fcda
    • Merge tag 'v6.8-rc3-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 9e28c7a2
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Five smb3 client fixes, mostly multichannel related:
      
         - four multichannel fixes including fix for channel allocation when
           multiple inactive channels, fix for unneeded race in channel
           deallocation, correct redundant channel scaling, and redundant
           multichannel disabling scenarios
      
         - add warning if max compound requests reached"
      
      * tag 'v6.8-rc3-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        smb: client: increase number of PDUs allowed in a compound request
        cifs: failure to add channel on iface should bump up weight
        cifs: do not search for channel if server is terminating
        cifs: avoid redundant calls to disable multichannel
        cifs: make sure that channel scaling is done only once
      9e28c7a2
    • Merge tag 'xfs-6.8-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · fc86e5c9
      Linus Torvalds authored
      Pull xfs fixes from Chandan Babu:
      
       - Clear XFS_ATTR_INCOMPLETE filter on removing xattr from a node format
         attribute fork
      
       - Remove conditional compilation of realtime geometry validator
         functions to prevent confusing error messages from being printed on
         the console during the mount operation
      
      * tag 'xfs-6.8-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: remove conditional building of rt geometry validator functions
        xfs: reset XFS_ATTR_INCOMPLETE filter on node removal
      fc86e5c9
    • Merge tag 'char-misc-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 3a0e9220
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are three tiny driver fixes for 6.8-rc3.  They include:
      
         - Android binder long-term bug with epoll finally being fixed
      
         - fastrpc driver shutdown bugfix
      
         - open-dice lockdep fix
      
        All of these have been in linux-next this week with no reported
        issues"
      
      * tag 'char-misc-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        binder: signal epoll threads of self-work
        misc: open-dice: Fix spurious lockdep warning
        misc: fastrpc: Mark all sessions as invalid in cb_remove
      3a0e9220
    • Merge tag 'tty-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 02149609
      Linus Torvalds authored
      Pull tty and serial driver fixes from Greg KH:
       "Here are some small tty and serial driver fixes for 6.8-rc3 that
        resolve a number of reported issues. Included in here are:
      
         - rs485 flag definition fix that affected the user/kernel abi in -rc1
      
         - max310x driver fixes
      
         - 8250_pci1xxxx driver off-by-one fix
      
         - uart_tiocmget locking race fix
      
        All of these have been in linux-next for over a week with no reported
        issues"
      
      * tag 'tty-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        serial: max310x: prevent infinite while() loop in port startup
        serial: max310x: fail probe if clock crystal is unstable
        serial: max310x: improve crystal stable clock detection
        serial: max310x: set default value when reading clock ready bit
        serial: core: Fix atomicity violation in uart_tiocmget
        serial: 8250_pci1xxxx: fix off by one in pci1xxxx_process_read_data()
        tty: serial: Fix bit order in RS485 flag definitions
      02149609
    • Merge tag 'usb-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 809be620
      Linus Torvalds authored
      Pull USB driver fixes from Greg KH:
       "Here are a bunch of small USB driver fixes for 6.8-rc3. Included in
        here are:
      
         - new usb-serial driver ids
      
         - new dwc3 driver id added
      
         - typec driver change revert
      
         - ncm gadget driver endian bugfix
      
         - xhci bugfixes for a number of reported issues
      
         - usb hub bugfix for alternate settings
      
         - ulpi driver debugfs memory leak fix
      
         - chipidea driver bugfix
      
         - usb gadget driver fixes
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'usb-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (24 commits)
        USB: serial: option: add Fibocom FM101-GL variant
        USB: serial: qcserial: add new usb-id for Dell Wireless DW5826e
        USB: serial: cp210x: add ID for IMST iM871A-USB
        usb: typec: tcpm: fix the PD disabled case
        usb: ucsi_acpi: Quirk to ack a connector change ack cmd
        usb: ucsi_acpi: Fix command completion handling
        usb: ucsi: Add missing ppm_lock
        usb: ulpi: Fix debugfs directory leak
        Revert "usb: typec: tcpm: fix cc role at port reset"
        usb: gadget: pch_udc: fix an Excess kernel-doc warning
        usb: f_mass_storage: forbid async queue when shutdown happen
        USB: hub: check for alternate port before enabling A_ALT_HNP_SUPPORT
        usb: chipidea: core: handle power lost in workqueue
        usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend
        usb: dwc3: pci: add support for the Intel Arrow Lake-H
        usb: core: Prevent null pointer dereference in update_port_device_state
        xhci: handle isoc Babble and Buffer Overrun events properly
        xhci: process isoc TD properly when there was a transaction error mid TD.
        xhci: fix off by one check when adding a secondary interrupter.
        xhci: fix possible null pointer dereference at secondary interrupter removal
        ...
      809be620
    • Merge tag 'i2c-for-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · bdda52cc
      Linus Torvalds authored
      Pull i2c fixlet from Wolfram Sang:
       "MAINTAINERS update to point people to the new tree for i2c host driver
        changes"
      
      * tag 'i2c-for-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        MAINTAINERS: Update i2c host drivers repository
      bdda52cc
    • Merge tag 'dmaengine-fix-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 8a0c60a0
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "Core:
      
         - fix return value of is_slave_direction() for D2D dma
      
        Driver fixes for:
      
         - Documentation fixes to resolve warnings for at_hdmac driver
      
         - bunch of fsl driver fixes for memory leaks, and useless kfree
      
         - TI edma and k3 fixes for packet error and null pointer checks"
      
      * tag 'dmaengine-fix-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: at_hdmac: add missing kernel-doc style description
        dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEV
        dmaengine: fsl-qdma: Remove a useless devm_kfree()
        dmaengine: fsl-qdma: Fix a memory leak related to the queue command DMA
        dmaengine: fsl-qdma: Fix a memory leak related to the status queue DMA
        dmaengine: ti: k3-udma: Report short packet errors
        dmaengine: ti: edma: Add some null pointer checks to the edma_probe
        dmaengine: fsl-dpaa2-qdma: Fix the size of dma pools
        dmaengine: at_hdmac: fix some kernel-doc warnings
      8a0c60a0
    • Merge tag 'phy-fixes-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · 843a33d6
      Linus Torvalds authored
      Pull phy driver fixes from Vinod Koul:
      
       - TI null pointer dereference
      
       - missing serdes mux entry in lan966x driver
      
       - Return of error code in renesas driver
      
       - Serdes init sequence and register offsets for IPQ drivers
      
      * tag 'phy-fixes-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
        phy: ti: phy-omap-usb2: Fix NULL pointer dereference for SRP
        phy: lan966x: Add missing serdes mux entry
        phy: renesas: rcar-gen3-usb2: Fix returning wrong error code
        phy: qcom-qmp-usb: fix serdes init sequence for IPQ6018
        phy: qcom-qmp-usb: fix register offsets for ipq8074/ipq6018
      843a33d6
    • Merge tag 'i2c-host-fixes-6.8-rc3' of... · 957bd221
      Wolfram Sang authored
      Merge tag 'i2c-host-fixes-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
      
      Just a maintenance patch that updates the repository where the
      i2c host and muxes related patches will be collected.
      957bd221
  4. Feb 03, 2024
    • Merge tag 'perf-tools-fixes-for-v6.8-1-2024-02-01' of... · b555d191
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.8-1-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
       "Vendor events:
      
         - Intel Alderlake/Sapphire Rapids metric fixes, the CPU type
           ("cpu_atom", "cpu_core") needs to be used as a prefix to be
           considered on a metric formula, detected via one of the 'perf test'
           entries.
      
        'perf test' fixes:
      
         - Fix the creation of event selector lists on 'perf test' entries, by
           initializing the sample ID flag, which is done by 'perf record', so
           this fix affects only the tests, the common case isn't affected
      
         - Make 'perf list' respect debug settings (-v) to fix its 'perf test'
           entry
      
         - Fix 'perf script' test when python support isn't enabled
      
         - Special case 'perf script' tests on s390, where only DWARF call
           graphs are supported and only on software events
      
         - Make 'perf daemon' signal test less racy
      
        Compiler warnings/errors:
      
         - Remove needless malloc(0) call in 'perf top' that triggers
           -Walloc-size
      
         - Fix calloc() argument order to address error introduced in gcc-14
      
        Build:
      
         - Require shellcheck v0.6.0 at minimum, so that the build does not
           fail with older versions
      
        Sync kernel header copies:
      
         - stat.h to pick STATX_MNT_ID_UNIQUE
      
         - msr-index.h to pick IA32_MKTME_KEYID_PARTITIONING
      
         - drm.h to pick DRM_IOCTL_MODE_CLOSEFB
      
         - unistd.h to pick {list,stat}mount,
           lsm_{[gs]et_self_attr,list_modules} syscall numbers
      
         - x86 cpufeatures to pick TDX, Zen, APIC MSR fence changes
      
         - x86's mem{cpy,set}_64.S used in 'perf bench'
      
         - Also, without tooling effects: asm-generic/unaligned.h, mount.h,
           fcntl.h, kvm headers"
      
      * tag 'perf-tools-fixes-for-v6.8-1-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (21 commits)
        perf tools headers: update the asm-generic/unaligned.h copy with the kernel sources
        tools include UAPI: Sync linux/mount.h copy with the kernel sources
        perf evlist: Fix evlist__new_default() for > 1 core PMU
        tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'
        tools headers x86 cpufeatures: Sync with the kernel sources to pick TDX, Zen, APIC MSR fence changes
        tools headers UAPI: Sync unistd.h to pick {list,stat}mount, lsm_{[gs]et_self_attr,list_modules} syscall numbers
        perf vendor events intel: Alderlake/sapphirerapids metric fixes
        tools headers UAPI: Sync kvm headers with the kernel sources
        perf tools: Fix calloc() arguments to address error introduced in gcc-14
        perf top: Remove needless malloc(0) call that triggers -Walloc-size
        perf build: Make minimal shellcheck version to v0.6.0
        tools headers UAPI: Update tools's copy of drm.h headers to pick DRM_IOCTL_MODE_CLOSEFB
        perf test shell daemon: Make signal test less racy
        perf test shell script: Fix test for python being disabled
        perf test: Workaround debug output in list test
        perf list: Add output file option
        perf list: Switch error message to pr_err() to respect debug settings (-v)
        perf test: Fix 'perf script' tests on s390
        tools headers UAPI: Sync linux/fcntl.h with the kernel sources
        tools arch x86: Sync the msr-index.h copy with the kernel sources to pick IA32_MKTME_KEYID_PARTITIONING
        ...
      b555d191
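The calloc() item in the pull message above concerns argument order: calloc() takes the element count first and the element size second. A minimal sketch of the conventional order follows, on the assumption that the gcc-14 diagnostic involved is the transposed-arguments check. The `struct sample` type and `alloc_samples` name are hypothetical and do not come from perf.

```c
#include <stdlib.h>

/* Hypothetical record type, used only for illustration. */
struct sample {
    int id;
    double value;
};

/*
 * Allocate a zeroed array of n samples.
 * Element count goes first, element size second.
 */
struct sample *alloc_samples(size_t n)
{
    return calloc(n, sizeof(struct sample));
}
```

Swapping the two arguments requests the same number of bytes, so the transposed form is a readability and diagnostics issue rather than a change in allocation size.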
    • Merge tag 'trace-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 56897d51
      Linus Torvalds authored
      Pull tracing and eventfs fixes from Steven Rostedt:
      
       - Fix the return code for ring_buffer_poll_wait()
      
         It was returning -EINVAL instead of EPOLLERR.
      
       - Zero out the tracefs_inode so that all fields are initialized.
      
         The ti->private could have had stale data, but instead of just
         initializing it to NULL, clear out the entire structure when it is
         allocated.
      
       - Fix a crash in timerlat
      
         The hrtimer was initialized at read and not open, but is canceled at
         close. If the file was opened and never read the close will pass a
         NULL pointer to hrtimer_cancel().
      
       - Rewrite of eventfs.
      
         Linus wrote a patch series to remove the dentry references in the
         eventfs_inode and to use ref counting and more of proper VFS
         interfaces to make it work.
      
       - Add warning to put_ei() if ei is not set to free. That means
         something is about to free it when it shouldn't.
      
       - Restructure the eventfs_inode to make it more compact, and remove the
         unused llist field.
      
       - Remove the fsnotify*() functions for when the inodes were being
         created in the lookup code. It doesn't make sense to notify about
         creation just because something is being looked up.
      
       - The inode hard link count was not accurate.
      
         It was being updated when a file was looked up. The inodes of
         directories were updating their parent inode hard link count every
         time the inode was created. That means if memory reclaim cleaned a
         stale directory inode and the inode was looked up again, it would
         increment the parent inode again as well. Al Viro said to just have
         all eventfs directories have a hard link count of 1. That tells user
         space not to trust it.
      
      * tag 'trace-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        eventfs: Keep all directory links at 1
        eventfs: Remove fsnotify*() functions from lookup()
        eventfs: Restructure eventfs_inode structure to be more condensed
        eventfs: Warn if an eventfs_inode is freed without is_freed being set
        tracing/timerlat: Move hrtimer_init to timerlat_fd open()
        eventfs: Get rid of dentry pointers without refcounts
        eventfs: Clean up dentry ops and add revalidate function
        eventfs: Remove unused d_parent pointer field
        tracefs: dentry lookup crapectomy
        tracefs: Avoid using the ei->dentry pointer unnecessarily
        eventfs: Initialize the tracefs inode properly
        tracefs: Zero out the tracefs_inode when allocating it
        ring-buffer: Clean ring_buffer_poll_wait() error return
      56897d51
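The ring_buffer_poll_wait() item in the pull message above says the function returned -EINVAL where EPOLLERR was expected. A poll-style function reports its result as a bit mask of events, so a negative errno read as a mask turns into unrelated event bits. The userspace sketch below illustrates this. Both function names are hypothetical, and this is not the kernel code.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>

/*
 * Hypothetical poll-style helper: the result is an event mask,
 * so failure is reported with EPOLLERR, never a negative errno.
 */
uint32_t demo_poll_mask(int buffer_valid, int data_ready)
{
    if (!buffer_valid)
        return EPOLLERR;
    return data_ready ? (EPOLLIN | EPOLLRDNORM) : 0;
}

/* What a caller sees when -EINVAL travels through a mask-typed value. */
uint32_t errno_as_mask(void)
{
    return (uint32_t)-EINVAL;
}
```

On Linux, where EINVAL is 22, the value from `errno_as_mask()` has the EPOLLPRI bit set and the EPOLLIN bit clear, which is not a meaningful error report.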
    • Merge tag 'gfs2-v6.8-rc2-revert' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · 6b89b6af
      Linus Torvalds authored
      Pull gfs2 revert from Andreas Gruenbacher:
       "It turns out that the commit to use GL_NOBLOCK flag for non-blocking
        lookups has several issues, and not all of them have a simple fix"
      
      * tag 'gfs2-v6.8-rc2-revert' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        Revert "gfs2: Use GL_NOBLOCK flag for non-blocking lookups"
      6b89b6af
    • Merge tag 'pci-v6.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci · b1dd6c26
      Linus Torvalds authored
      Pull pci fixes from Bjorn Helgaas:
      
       - Fix a potential deadlock that was reintroduced by an ASPM revert
         merged for v6.8 (Johan Hovold)
      
       - Add Manivannan Sadhasivam as PCI Endpoint maintainer (Lorenzo
         Pieralisi)
      
      * tag 'pci-v6.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
        MAINTAINERS: Add Manivannan Sadhasivam as PCI Endpoint maintainer
        PCI/ASPM: Fix deadlock when enabling ASPM
      b1dd6c26
    • Merge tag 'drm-fixes-2024-02-03' of git://anongit.freedesktop.org/drm/drm · 9c2f0338
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular weekly fixes, mostly amdgpu and xe. One nouveau fix is a
        better fix for the deadlock and also helps with a sync race we were
        seeing.
      
        dma-buf:
         - heaps CMA page accounting fix
      
        virtio-gpu:
         - fix segment size
      
        xe:
         - A crash fix
         - A fix for an assert due to missing mem_access ref
         - Only allow a single user-fence per exec / bind.
         - Some sparse warning fixes
         - Two fixes for compilation failures on various odd combinations of
           gcc / arch pointed out on LKML.
         - Fix a fragile partial allocation pointed out on LKML.
         - A sysfs ABI documentation warning fix
      
        amdgpu:
         - Fix reboot issue seen on some 7000 series dGPUs
         - Fix client init order for KFD
         - Misc display fixes
         - USB-C fix
         - DCN 3.5 fixes
         - Fix issues with GPU scheduler and GPU reset
         - GPU firmware loading fix
         - Misc fixes
         - GC 11.5 fix
         - VCN 4.0.5 fix
         - IH overflow fix
      
        amdkfd:
         - SVM fixes
         - Trap handler fix
         - Fix device permission lookup
         - Properly reserve BO before validating it
      
        nouveau:
         - fence/irq lock deadlock fix (second attempt)
         - gsp command size fix
      
      * tag 'drm-fixes-2024-02-03' of git://anongit.freedesktop.org/drm/drm: (35 commits)
        nouveau: offload fence uevents work to workqueue
        nouveau/gsp: use correct size for registry rpc.
        drm/amdgpu/pm: Use inline function for IP version check
        drm/hwmon: Fix abi doc warnings
        drm/xe: Make all GuC ABI shift values unsigned
        drm/xe/vm: Subclass userptr vmas
        drm/xe: Use LRC prefix rather than CTX prefix in lrc desc defines
        drm/xe: Don't use __user error pointers
        drm/xe: Annotate mcr_[un]lock()
        drm/xe: Only allow 1 ufence per exec / bind IOCTL
        drm/xe: Grab mem_access when disabling C6 on skip_guc_pc platforms
        drm/xe: Fix crash in trace_dma_fence_init()
        drm/amdgpu: Reset IH OVERFLOW_CLEAR bit
        drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend
        drm/amdgpu: drm/amdgpu: remove golden setting for gfx 11.5.0
        drm/amdkfd: reserve the BO before validating it
        drm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()'
        drm/amd/display: Fix buffer overflow in 'get_host_router_total_dp_tunnel_bw()'
        drm/amd/display: Add NULL check for kzalloc in 'amdgpu_dm_atomic_commit_tail()'
        drm/amd: Don't init MEC2 firmware when it fails to load
        ...
      9c2f0338
    • Merge tag 'input-for-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · eab5c86d
      Linus Torvalds authored
      Pull input fixes from Dmitry Torokhov:
      
       - a fix for the fix to deal with newer laptops which get confused by
         the "GET ID" command when probing for PS/2 keyboards
      
       - a couple of tweaks to i8042 to handle Clevo NS70PU and Lifebook U728
         laptops
      
       - a change to bcm5974 to validate that the device has appropriate
         endpoints
      
       - an addition of new product ID to xpad driver to recognize Lenovo
         Legion Go controllers
      
       - a quirk to Goodix controller to deal with extra GPIO described in
         ACPI tables on some devices.
      
      * tag 'input-for-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: i8042 - add Fujitsu Lifebook U728 to i8042 quirk table
        Input: i8042 - fix strange behavior of touchpad on Clevo NS70PU
        Input: atkbd - do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID
        Input: atkbd - skip ATKBD_CMD_SETLEDS when skipping ATKBD_CMD_GETID
        Input: bcm5974 - check endpoint type before starting traffic
        Input: xpad - add Lenovo Legion Go controllers
        Input: goodix - accept ACPI resources with gpio_count == 3 && gpio_int_idx == 0
      eab5c86d
    • Merge tag 'sound-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 01370ceb
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of fixes, mostly device-specific ones:
      
         - Minor PCM core fix for name strings
      
         - ASoC Qualcomm fixes, including DAI support extensions
      
         - ASoC AMD platform updates
      
         - ASoC Allwinner platform updates
      
         - Various ASoC codec fixes for WSA, WCD, ES8326 drivers
      
         - Various HD-audio and USB-audio fixes and quirks
      
         - A series of fixes for Cirrus CS35L56 codecs"
      
      * tag 'sound-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (63 commits)
        ALSA: usb-audio: Ignore clock selector errors for single connection
        ALSA: hda/realtek: Enable headset mic on Vaio VJFE-ADL
        ALSA: hda: cs35l56: Remove unused test stub function
        ALSA: hda: cs35l56: Firmware file must match the version of preloaded firmware
        ALSA: hda: cs35l56: Fix filename string field layout
        ALSA: hda: cs35l56: Fix order of searching for firmware files
        ASoC: cs35l56: Allow more time for firmware to boot
        ASoC: cs35l56: Load tunings for the correct speaker models
        ASoC: cs35l56: Firmware file must match the version of preloaded firmware
        ASoC: cs35l56: Fix misuse of wm_adsp 'part' string for silicon revision
        ASoC: cs35l56: Fix for initializing ASP1 mixer registers
        ALSA: hda: cs35l56: Initialize all ASP1 registers
        ASoC: cs35l56: Fix default SDW TX mixer registers
        ASoC: cs35l56: Fix to ensure ASP1 registers match cache
        ASoC: cs35l56: Remove buggy checks from cs35l56_is_fw_reload_needed()
        ASoC: cs35l56: Don't add the same register patch multiple times
        ASoC: cs35l56: cs35l56_component_remove() must clean up wm_adsp
        ASoC: cs35l56: cs35l56_component_remove() must clear cs35l56->component
        ASoC: wm_adsp: Don't overwrite fwf_name with the default
        ASoC: wm_adsp: Fix firmware file search order
        ...
      01370ceb
    • Merge tag 'hwmon-for-v6.8-rc3' of... · 43e7ef64
      Linus Torvalds authored
      Merge tag 'hwmon-for-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
      
       - pmbus/mp2975: Fix driver initialization
      
       - gigabyte_waterforce: Add missing unlock in error handling path
      
      * tag 'hwmon-for-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (pmbus/mp2975) Correct comment inside 'mp2975_read_byte_data'
        hwmon: (pmbus/mp2975) Fix driver initialization for MP2975 device
        hwmon: gigabyte_waterforce: Fix locking bug in waterforce_get_status()
      43e7ef64
    • Merge tag 'for-v6.8-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply · 79837a7c
      Linus Torvalds authored
      Pull power supply fix from Sebastian Reichel:
      
       - qcom_battmgr: revert broken fix
      
      * tag 'for-v6.8-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
        Revert "power: supply: qcom_battmgr: Register the power supplies after PDR is up"
      79837a7c
    • Merge tag 'iommu-fixes-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 4f18d3fd
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - Make iommu_ops->default_domain work without CONFIG_IOMMU_DMA to fix
         initialization of FSL-PAMU devices
      
       - Fix for Tegra fbdev initialization failure
      
       - Fix for a VFIO device unbinding failure on PowerPC
      
      * tag 'iommu-fixes-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        powerpc: iommu: Bring back table group release_ownership() call
        drm/tegra: Do not assume that a NULL domain means no DMA IOMMU
        iommu: Allow ops->default_domain to work when !CONFIG_IOMMU_DMA
      4f18d3fd
    • Merge tag 'for-6.8/dm-fixes' of... · 6897cea7
      Linus Torvalds authored
      Merge tag 'for-6.8/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix DM ioctl interface to avoid INT_MAX overflow warnings from
         kvmalloc by limiting the number of targets and parameter size area.
      
       - Fix DM stats to avoid INT_MAX overflow warnings from kvmalloc by
         limiting the number of entries supported.
      
       - Fix DM writecache to support mapping devices larger than 1 TiB by
         switching from using kvmalloc_array to vmalloc_array -- which avoids
         INT_MAX overflow in kvmalloc_node and associated warnings.
      
       - Remove the (ab)use of tasklets from both the DM crypt and verity
         targets. They will be converted to use BH workqueue in future.
      
      * tag 'for-6.8/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm-crypt, dm-verity: disable tasklets
        dm writecache: allow allocations larger than 2GiB
        dm stats: limit the number of entries
        dm: limit the number of targets and parameter size area
      6897cea7
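The device mapper fixes above avoid the kvmalloc INT_MAX warnings by capping counts before the allocation size is computed. A minimal userspace sketch of that kind of bound check follows. The helper name `bounded_array_bytes` is hypothetical, and the actual limits chosen in the DM patches are not shown here.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute count * elem_size, rejecting any request
 * whose total would overflow or exceed INT_MAX bytes.
 * Returns 0 on success and stores the size in *bytes, or -1 on rejection.
 */
int bounded_array_bytes(size_t count, size_t elem_size, size_t *bytes)
{
    /* Dividing first keeps the check itself free of overflow. */
    if (elem_size != 0 && count > (size_t)INT_MAX / elem_size)
        return -1;

    *bytes = count * elem_size;
    return 0;
}
```

Checking the count against the limit before multiplying means the oversized request is refused with an error instead of reaching the allocator.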
    • Merge tag 'ata-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux · 03503275
      Linus Torvalds authored
      Pull ata fix from Niklas Cassel:
      
       - Following up on last week's ASMedia ASM1061 43-bit dma_mask quirk, we
         sent an email to ASMedia developers that have previously been active
         on the mailing list, asking exactly which SATA controllers are
         affected by this hardware limitation.
      
         We got a reply that it affects all the SATA controllers in the
         ASM106x family, thus extend the existing 43-bit dma_mask quirk to
         apply to all the affected ASMedia SATA controllers.
      
      * tag 'ata-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
        ahci: Extend ASM1061 43-bit DMA address quirk to other ASM106x parts
      03503275