  1. Aug 02, 2023
    • powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs · e43c0a0c
      Nicholas Piggin authored
      
      
      This performs lazy tlb mm shootdown when doing the exit TLB flush when
      all mm users go away and user mappings are removed, which avoids having
      to do the lazy tlb mm shootdown IPIs on the final mmput when all kernel
      references disappear.
      
      powerpc/64s uses a broadcast TLBIE for the exit TLB flush if remote CPUs
      need to be invalidated (unless TLBIE is disabled), so this doesn't
      necessarily save IPIs but it does avoid a broadcast TLBIE which is quite
      expensive.
      
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      [mpe: Squash in preempt_disable/enable() fix from Nick]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20230524060821.148015-5-npiggin@gmail.com
    • powerpc: Add mm_cpumask warning when context switching · 177255af
      Nicholas Piggin authored
      
      
      When context switching away from an mm, add a CONFIG_DEBUG_VM warning
      check to ensure this CPU is still set in the mask. This could catch
      bugs where the mask is improperly trimmed while the CPU is still using
      the mm.
      
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20230524060821.148015-4-npiggin@gmail.com
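As a rough illustration of the check described in 177255af, the following userspace C sketch models an mm's CPU mask as a 64-bit word and tests that the current CPU's bit is still set when switching away. The names mm_model, mm_mark_cpu, mm_trim_cpu and switch_away_ok are hypothetical and are not kernel APIs; the kernel uses its cpumask helpers and a CONFIG_DEBUG_VM warning rather than a boolean return, and this model assumes fewer than 64 CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an mm: one bit per CPU that may hold its TLB entries. */
struct mm_model {
	uint64_t cpumask;
};

/* Record that `cpu` (assumed < 64) has started using this mm. */
static inline void mm_mark_cpu(struct mm_model *mm, unsigned int cpu)
{
	mm->cpumask |= UINT64_C(1) << cpu;
}

/* Trim `cpu` from the mask, e.g. after its TLB entries were flushed. */
static inline void mm_trim_cpu(struct mm_model *mm, unsigned int cpu)
{
	mm->cpumask &= ~(UINT64_C(1) << cpu);
}

/*
 * Sanity check when switching away from `mm` on `cpu`: the bit must still
 * be set. A false result models the condition a debug build would warn on.
 */
static inline bool switch_away_ok(const struct mm_model *mm, unsigned int cpu)
{
	return (mm->cpumask >> cpu) & 1;
}
```

If the mask were trimmed while the CPU was still running on the mm, switch_away_ok() would return false at the next switch, which is the kind of bug the warning is meant to surface.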
    • powerpc/64s: Use dec_mm_active_cpus helper · f74b2a6c
      Nicholas Piggin authored
      
      
      Avoid open-coded atomic_dec on mm->context.active_cpus and use the
      function made for it. Add CONFIG_DEBUG_VM underflow checking on the
      counter.
      
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20230524060821.148015-3-npiggin@gmail.com
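The idea in f74b2a6c, routing every decrement through one helper so an underflow check can live in a single place, can be sketched in userspace C with C11 atomics. This is only a model under stated assumptions: mm_context_model, inc_active_cpus and dec_active_cpus are hypothetical names, and the boolean return stands in for the CONFIG_DEBUG_VM warning the commit describes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm context; not the kernel's struct. */
struct mm_context_model {
	atomic_int active_cpus;
};

static inline void inc_active_cpus(struct mm_context_model *ctx)
{
	atomic_fetch_add(&ctx->active_cpus, 1);
}

/*
 * Single decrement helper. Returns false when the counter was already
 * zero or negative before the decrement, i.e. when it underflowed.
 */
static inline bool dec_active_cpus(struct mm_context_model *ctx)
{
	int old = atomic_fetch_sub(&ctx->active_cpus, 1);

	return old > 0;
}
```

Compared with an open-coded decrement at each call site, the helper gives one obvious place to add or remove debugging checks.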
    • powerpc: Account mm_cpumask and active_cpus in init_mm · c3c2e937
      Nicholas Piggin authored
      
      
      The init_mm mm_cpumask and context.active_cpus are not maintained at
      boot and hotplug. This seems to be harmless because init_mm does not
      have a userspace and so never gets user TLBs flushed, but it looks odd
      and it prevents some sanity checks from being added.
      
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20230524060821.148015-2-npiggin@gmail.com
    • powerpc/64: Enable accelerated crypto algorithms in defconfig · ab481817
      Michael Ellerman authored
      
      
      Enable all the accelerated crypto algorithms as modules in the 64-bit
      defconfig, to get more test coverage.
      
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20230717115223.286158-1-mpe@ellerman.id.au
    • powerpc/crypto: don't build aes-gcm-p10 by default · 026fa6c5
      Omar Sandoval authored
      
      
      None of the other accelerated crypto modules are built by default.
      
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/40d9c7ebe82c9a9d4ace542ac433753d2f22c6a0.1689007370.git.osandov@osandov.com
    • powerpc/crypto: fix missing skcipher dependency for aes-gcm-p10 · 9d6e1c21
      Omar Sandoval authored
      
      
      My stripped down configuration fails to build with:
      
        ERROR: modpost: "skcipher_walk_aead_encrypt" [arch/powerpc/crypto/aes-gcm-p10-crypto.ko] undefined!
        ERROR: modpost: "skcipher_walk_done" [arch/powerpc/crypto/aes-gcm-p10-crypto.ko] undefined!
        ERROR: modpost: "skcipher_walk_aead_decrypt" [arch/powerpc/crypto/aes-gcm-p10-crypto.ko] undefined!
      
      Fix it by selecting CRYPTO_SKCIPHER.
      
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/c55ad70799e027a3d2756b85ccadc0af52ae8915.1689007370.git.osandov@osandov.com
    • powerpc/kuap: Use ASM feature fixups instead of static branches · 3a24ea0d
      Christophe Leroy authored
      
      
      To avoid a useless nop on top of every uaccess enable/disable and
      make life easier for objtool, replace static branches with ASM feature
      fixups that will nop out the KUAP enabling instructions in the unlikely
      case KUAP is disabled at boot time.
      
      Leave it as is on book3s/64 for now, it will be handled later when
      objtool is activated on PPC64.
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/671948788024fd890ec4ed175bc332dab8664ea5.1689091022.git.christophe.leroy@csgroup.eu
    • powerpc/kuap: KUAP enabling/disabling functions must be __always_inline · eb52f66f
      Christophe Leroy authored
      
      
      Objtool reports the following warnings:
      
        arch/powerpc/kernel/signal_32.o: warning: objtool:
          __prevent_user_access.constprop.0+0x4 (.text+0x4):
          redundant UACCESS disable
      
        arch/powerpc/kernel/signal_32.o: warning: objtool: user_access_begin+0x2c
          (.text+0x4c): return with UACCESS enabled
      
        arch/powerpc/kernel/signal_32.o: warning: objtool: handle_rt_signal32+0x188
          (.text+0x360): call to __prevent_user_access.constprop.0() with UACCESS enabled
      
        arch/powerpc/kernel/signal_32.o: warning: objtool: handle_signal32+0x150
          (.text+0x4d4): call to __prevent_user_access.constprop.0() with UACCESS enabled
      
      This is due to some KUAP enabling/disabling functions being emitted out
      of line although they are marked inline. Use __always_inline instead.
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/ca5e50ddbec3867db5146ebddbc9a1dc0e443bc8.1689091022.git.christophe.leroy@csgroup.eu
    • powerpc/kuap: Simplify KUAP lock/unlock on BOOK3S/32 · 5222a1d5
      Christophe Leroy authored
      
      
      On book3s/32 KUAP is performed at segment level. At the moment,
      when enabling userspace access, only the current segment is modified.
      Then if a write is performed on another user segment, a fault is
      taken and all other user segments get enabled for userspace
      access. This then requires special attention when disabling
      userspace access.
      
      Having a userspace write access crossing a segment boundary is
      unlikely. Having a userspace write access crossing a segment boundary
      back and forth is even more unlikely. So, instead of enabling
      userspace access on all segments when a write fault occurs, just
      change which segment has userspace access enabled in order to
      eliminate the case when more than one segment has userspace access
      enabled. That simplifies userspace access deactivation.
      
      There is however a corner case which is even more unlikely but has
      to be handled anyway: an unaligned access which crosses a
      segment boundary. That would require userspace access to be enabled
      on both segments at once. To avoid complicating the likely case for
      such an unlikely event, handle that situation like an alignment
      exception and emulate the store.
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/8de8580513c1a6e880bad1ba9a69d3efad3d4fa5.1689091022.git.christophe.leroy@csgroup.eu
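The corner case in 5222a1d5 comes down to address arithmetic. Book3S/32 divides the 32-bit effective address space into 16 segments of 256 MiB, so the segment number is the top four address bits. The sketch below is a userspace illustration only: segment_of and crosses_segment are hypothetical helpers, not functions from the patch, and the check assumes size is at least 1 and that addr + size does not wrap around the top of the address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 256 MiB segments: the segment number is the top 4 bits of the address. */
#define SEGMENT_SHIFT 28

static inline unsigned int segment_of(uint32_t addr)
{
	return addr >> SEGMENT_SHIFT;
}

/*
 * True when an access of `size` bytes starting at `addr` touches two
 * different segments, which is the case the commit emulates instead of
 * enabling user access on both segments.
 */
static inline bool crosses_segment(uint32_t addr, uint32_t size)
{
	uint32_t last = addr + size - 1;

	return segment_of(addr) != segment_of(last);
}
```

For example, a 4-byte store at 0x0FFFFFFE covers bytes in segments 0 and 1, while the same store at 0x0FFFFFFC stays inside segment 0.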
    • powerpc/kuap: Use MMU_FTR_KUAP on all and refactor disabling kuap · 26e04120
      Christophe Leroy authored
      
      
      All but book3s/64 use a static branch key for disabling kuap.
      book3s/64 uses an mmu feature.
      
      Refactor all targets to use MMU_FTR_KUAP like book3s/64.
      
      For PPC32 that implies updating mmu features fixups once KUAP
      has been initialised.
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/6b3d7c977bad73378ea368bc6818e9c94ea95ab0.1689091022.git.christophe.leroy@csgroup.eu
    • powerpc/kuap: MMU_FTR_BOOK3S_KUAP becomes MMU_FTR_KUAP · 4589a2b7
      Christophe Leroy authored
      
      
      In order to reuse MMU_FTR_BOOK3S_KUAP for targets other than BOOK3S,
      rename it to MMU_FTR_KUAP.
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/c8b6f7b8cd0eeaace96879ed0e0a157faa619451.1689091022.git.christophe.leroy@csgroup.eu
    • powerpc/features: Add capability to update mmu features later · 6b289911
      Christophe Leroy authored
      
      
      On powerpc32, features fixup is performed very early and that's too
      early to read the cmdline and take into account the 'nosmap' parameter.
      
      On the other hand, no userspace access is performed that early and
      KUAP feature fixup can be performed later.
      
      Add a function to update mmu features. The function is passed a
      mask with the features that can be updated.
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/31b27ee2c9d338f4f82cd8cd69d6bff979495290.1689091022.git.christophe.leroy@csgroup.eu
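The role of the mask mentioned in 6b289911 can be pictured with plain bit operations: a late update may only change feature bits that the caller lists in the mask, and every other bit keeps its early-boot value. The sketch below is an assumption-laden model, not the function added by the patch; update_features and the FTR_* bit values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FTR_MODEL_A	0x1u
#define FTR_MODEL_B	0x2u
#define FTR_MODEL_KUAP	0x4u

/*
 * Merge `wanted` into `current`, but only for the bits set in `mask`.
 * Bits outside the mask are taken from `current` unchanged.
 */
static inline uint32_t update_features(uint32_t current, uint32_t wanted,
				       uint32_t mask)
{
	return (current & ~mask) | (wanted & mask);
}
```

With a mask of FTR_MODEL_KUAP, a late update can clear or set the KUAP bit once the command line has been parsed, without being able to disturb any other feature decided at early boot.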
    • powerpc/kuap: Fold kuep_is_disabled() into its only user · 38bb171b
      Christophe Leroy authored
      kuep_is_disabled() was introduced by commit 91bb3082 ("powerpc/32s:
      Refactor update of user segment registers") but then all users but one
      were removed by commit 526d4a4c ("powerpc/32s: Do kuep_lock() and
      kuep_unlock() in assembly").
      
      Fold kuep_is_disabled() into init_new_context() which is its only user.
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/b2247147c0a8c830ac82966451647850df4a64da.1689091022.git.christophe.leroy@csgroup.eu
    • powerpc/kuap: Avoid useless jump_label on empty function · 1bec4adc
      Christophe Leroy authored
      
      
      Disassembly of interrupt_enter_prepare() shows a pointless nop
      before the mftb:
      
        c000abf0 <interrupt_enter_prepare>:
        c000abf0:       81 23 00 84     lwz     r9,132(r3)
        c000abf4:       71 29 40 00     andi.   r9,r9,16384
        c000abf8:       41 82 00 28     beq-    c000ac20 <interrupt_enter_prepare+0x30>
        c000abfc: ===>  60 00 00 00     nop	<====
        c000ac00:       7d 0c 42 e6     mftb    r8
        c000ac04:       80 e2 00 08     lwz     r7,8(r2)
        c000ac08:       81 22 00 28     lwz     r9,40(r2)
        c000ac0c:       91 02 00 24     stw     r8,36(r2)
        c000ac10:       7d 29 38 50     subf    r9,r9,r7
        c000ac14:       7d 29 42 14     add     r9,r9,r8
        c000ac18:       91 22 00 08     stw     r9,8(r2)
        c000ac1c:       4e 80 00 20     blr
        c000ac20:       60 00 00 00     nop
        c000ac24:       7d 5a c2 a6     mfmd_ap r10
        c000ac28:       3d 20 de 00     lis     r9,-8704
        c000ac2c:       91 43 00 b0     stw     r10,176(r3)
        c000ac30:       7d 3a c3 a6     mtspr   794,r9
        c000ac34:       4e 80 00 20     blr
      
      That comes from the call to kuap_lock(), although __kuap_lock() is an
      empty function on the 8xx.
      
      To avoid that, only perform kuap_is_disabled() check when there is
      something to do with __kuap_lock().
      
      Do the same with __kuap_save_and_lock() and
      __kuap_get_and_assert_locked().
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/a854d25bea375d4ba6ca9c2617f9edbba397100a.1689091022.git.christophe.leroy@csgroup.eu
    • powerpc/kuap: Avoid unnecessary reads of MD_AP · 880df2d4
      Christophe Leroy authored
      
      
      A disassembly of interrupt_exit_kernel_prepare() shows a useless read
      of MD_AP register. This is shown by r9 being re-used immediately without
      doing anything with the value read.
      
        c000e0e0:       60 00 00 00     nop
        c000e0e4: ===>  7d 3a c2 a6     mfmd_ap r9	<====
        c000e0e8:       7d 20 00 a6     mfmsr   r9
        c000e0ec:       7c 51 13 a6     mtspr   81,r2
        c000e0f0:       81 3f 00 84     lwz     r9,132(r31)
        c000e0f4:       71 29 80 00     andi.   r9,r9,32768
      
      kuap_get_and_assert_locked() is paired with kuap_kernel_restore()
      and both are only used in interrupt_exit_kernel_prepare(). The value
      returned by kuap_get_and_assert_locked() is only used by
      kuap_kernel_restore().
      
      On 8xx, kuap_kernel_restore() doesn't use the value read by
      kuap_get_and_assert_locked() so modify kuap_get_and_assert_locked()
      to not perform the read of MD_AP and return 0 instead.
      
      The same applies on BOOKE.
      
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/bcbc84c2dd90bb1021da792b1968cdc22112dad8.1689091022.git.christophe.leroy@csgroup.eu
  2. Jul 24, 2023
    • Linux 6.5-rc3 · 6eaae198
      Linus Torvalds authored
    • Merge tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 3b4e48b8
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Swapping the ring buffer for snapshotting (for things like irqsoff)
         can crash if the ring buffer is being resized. Disable swapping when
         this happens. The missed swap will be reported to the tracer
      
       - Report error if the histogram fails to be created due to an error in
         adding a histogram variable, in event_hist_trigger_parse()
      
       - Remove unused declaration of tracing_map_set_field_descr()
      
      * tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing/histograms: Return an error if we fail to add histogram to hist_vars list
        ring-buffer: Do not swap cpu_buffer during resize process
        tracing: Remove unused extern declaration tracing_map_set_field_descr()
    • Merge tag 'kbuild-fixes-v6.5' of... · 12a5336c
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix stale help text in gconfig
      
       - Support *.S files in compile_commands.json
      
       - Flatten KBUILD_CFLAGS
      
       - Fix external module builds with Rust so that temporary files are
         created in the modules directories instead of the kernel tree
      
      * tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: rust: avoid creating temporary files
        kbuild: flatten KBUILD_CFLAGS
        gen_compile_commands: add assembly files to compilation database
        kconfig: gconfig: correct program name in help text
        kconfig: gconfig: drop the Show Debug Info help text
    • kbuild: rust: avoid creating temporary files · df01b7cf
      Miguel Ojeda authored
      
      
      `rustc` outputs by default the temporary files (i.e. the ones saved
      by `-Csave-temps`, such as `*.rcgu*` files) in the current working
      directory when `-o` and `--out-dir` are not given (even if
      `--emit=x=path` is given, i.e. it does not use those for temporaries).
      
      Since out-of-tree modules are compiled from the `linux` tree,
      `rustc` then tries to create them there, which may not be accessible.
      
      Thus pass `--out-dir` explicitly, even if it is just for the temporary
      files.
      
      Similarly, do so for Rust host programs too.
      
      Reported-by: Raphael Nestler <raphael.nestler@gmail.com>
      Closes: https://github.com/Rust-for-Linux/linux/issues/1015
      Reported-by: Andrea Righi <andrea.righi@canonical.com>
      Tested-by: Raphael Nestler <raphael.nestler@gmail.com> # non-hostprogs
      Tested-by: Andrea Righi <andrea.righi@canonical.com> # non-hostprogs
      Fixes: 295d8398 ("kbuild: specify output names separately for each emission type from rustc")
      Cc: stable@vger.kernel.org
      Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
      Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 269f4a4b
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM:
      
         - Avoid pKVM finalization if KVM initialization fails
      
         - Add missing BTI instructions in the hypervisor, fixing an early
           boot failure on BTI systems
      
         - Handle MMU notifiers correctly for non hugepage-aligned memslots
      
         - Work around a bug in the architecture where hypervisor timer
           controls have UNKNOWN behavior under nested virt
      
         - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel
           BUG in cpu hotplug resulting from per-CPU accessor sanity checking
      
         - Make WFI emulation on GICv4 systems robust w.r.t. preemption,
           consistently requesting a doorbell interrupt on vcpu_put()
      
         - Uphold RES0 sysreg behavior when emulating older PMU versions
      
         - Avoid macro expansion when initializing PMU register names,
           ensuring the tracepoints pretty-print the sysreg
      
        s390:
      
         - Two fixes for asynchronous destroy
      
        x86 fixes will come early next week"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: pv: fix index value of replaced ASCE
        KVM: s390: pv: simplify shutdown and fix race
        KVM: arm64: Fix the name of sys_reg_desc related to PMU
        KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount
        KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
        KVM: arm64: Add missing BTI instructions
        KVM: arm64: Correctly handle page aging notifiers for unaligned memslot
        KVM: arm64: Disable preemption in kvm_arch_hardware_enable()
        KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm
        KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits
    • Merge tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 15b593ba
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Bug and regression fixes for 6.5-rc3 for ext4's mballoc and jbd2's
        checkpoint code"
      
      * tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
        ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
        ext4: correct inline offset when handling xattrs in inode body
        jbd2: remove __journal_try_to_free_buffer()
        jbd2: fix a race when checking checkpoint buffer busy
        jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
        jbd2: remove journal_clean_one_cp_list()
        jbd2: remove t_checkpoint_io_list
        jbd2: recheck chechpointing non-dirty buffer
    • Merge tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6 · 8266f53b
      Linus Torvalds authored
      Pull smb client fix from Steve French:
       "Add minor debugging improvement.
      
        The change improves ability to read a network trace to debug problems
        on encrypted connections which are very common (e.g. using wireshark
        or tcpdump).
      
        That works today with tools like 'smbinfo keys /mnt/file' but requires
        passing in a filename on the mount (see e.g. [1]), but it often makes
        more sense to just pass in the mount point path (ie a directory not a
        filename).
      
        So this fix was needed to debug some types of problems (an obvious
        example is on an encrypted connection failing operations on an empty
        share or with no files in the root of the directory) - so you can
        simply pass in the 'smbinfo keys <mntpoint>' and get the information
        that wireshark needs"
      
      Link: https://wiki.samba.org/index.php/Wireshark_Decryption [1]
      
      * tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: update internal module version number for cifs.ko
        cifs: allow dumping keys for directories too
    • Merge tag 'kvm-s390-master-6.5-1' of... · 0c189708
      Paolo Bonzini authored
      Merge tag 'kvm-s390-master-6.5-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      Two fixes for asynchronous destroy
    • Merge tag 'kvmarm-fixes-6.5-1' of... · 675a15f4
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.5, part #1
      
       - Avoid pKVM finalization if KVM initialization fails
      
       - Add missing BTI instructions in the hypervisor, fixing an early boot
         failure on BTI systems
      
       - Handle MMU notifiers correctly for non hugepage-aligned memslots
      
       - Work around a bug in the architecture where hypervisor timer controls
         have UNKNOWN behavior under nested virt.
      
       - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel BUG
         in cpu hotplug resulting from per-CPU accessor sanity checking.
      
       - Make WFI emulation on GICv4 systems robust w.r.t. preemption,
         consistently requesting a doorbell interrupt on vcpu_put()
      
       - Uphold RES0 sysreg behavior when emulating older PMU versions
      
       - Avoid macro expansion when initializing PMU register names, ensuring
         the tracepoints pretty-print the sysreg.
  3. Jul 23, 2023
    • tracing/histograms: Return an error if we fail to add histogram to hist_vars list · 4b8b3905
      Mohamed Khalfella authored
      Commit 6018b585 ("tracing/histograms: Add histograms to hist_vars if
      they have referenced variables") added a check to fail histogram creation
      if save_hist_vars() failed to add the histogram to the hist_vars list. But
      that commit did not set ret to an error code before jumping to the
      histogram unregister path. Fix it.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230714203341.51396-1-mkhalfella@purestorage.com
      
      Cc: stable@vger.kernel.org
      Fixes: 6018b585 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables")
      Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
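The bug class fixed in 4b8b3905 is a cleanup path reached through goto while the return variable still holds 0, so the caller sees success even though the operation was rolled back. The userspace sketch below shows the corrected shape under stated assumptions: hist_model, save_vars_model and create_hist_model are hypothetical names and do not reproduce the code in event_hist_trigger_parse().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a histogram trigger being set up. */
struct hist_model {
	bool registered;
};

/* Stand-in for a step that can fail after registration has happened. */
static inline int save_vars_model(bool simulate_failure)
{
	return simulate_failure ? -ENOMEM : 0;
}

static inline int create_hist_model(struct hist_model *h, bool simulate_failure)
{
	int ret = 0;

	h->registered = true;	/* earlier step that must be undone on error */

	/* Capture the error code before jumping to the cleanup label. */
	ret = save_vars_model(simulate_failure);
	if (ret)
		goto out_unreg;

	return 0;

out_unreg:
	h->registered = false;
	return ret;		/* non-zero here, so the caller sees the failure */
}
```

Had the failing call been tested without assigning its result to ret, the cleanup would still run but the function would return 0.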
    • ring-buffer: Do not swap cpu_buffer during resize process · 8a96c028
      Chen Lin authored
      
      
      When ring_buffer_swap_cpu() is called during the resize process,
      the CPU buffer is swapped in the middle of the resize, leaving it in an
      inconsistent state. Continuing to run in that state results in an oops.
      
      This issue can be easily reproduced using the following two scripts:
      /tmp # cat test1.sh
      #!/bin/sh
      for i in `seq 0 100000`
      do
               echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb
               sleep 0.5
               echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb
               sleep 0.5
      done
      /tmp # cat test2.sh
      #!/bin/sh
      for i in `seq 0 100000`
      do
              echo irqsoff > /sys/kernel/debug/tracing/current_tracer
              sleep 1
              echo nop > /sys/kernel/debug/tracing/current_tracer
              sleep 1
      done
      /tmp # ./test1.sh &
      /tmp # ./test2.sh &
      
      A typical oops log is shown below; other oops variants are sometimes seen.
      
      [  231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8
      [  231.713375] Modules linked in:
      [  231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec23f92 #15
      [  231.716750] Hardware name: linux,dummy-virt (DT)
      [  231.718152] Workqueue: events update_pages_handler
      [  231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  231.721171] pc : rb_update_pages+0x378/0x3f8
      [  231.722212] lr : rb_update_pages+0x25c/0x3f8
      [  231.723248] sp : ffff800082b9bd50
      [  231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
      [  231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0
      [  231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a
      [  231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000
      [  231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510
      [  231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002
      [  231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558
      [  231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001
      [  231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000
      [  231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208
      [  231.744196] Call trace:
      [  231.744892]  rb_update_pages+0x378/0x3f8
      [  231.745893]  update_pages_handler+0x1c/0x38
      [  231.746893]  process_one_work+0x1f0/0x468
      [  231.747852]  worker_thread+0x54/0x410
      [  231.748737]  kthread+0x124/0x138
      [  231.749549]  ret_from_fork+0x10/0x20
      [  231.750434] ---[ end trace 0000000000000000 ]---
      [  233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [  233.721696] Mem abort info:
      [  233.721935]   ESR = 0x0000000096000004
      [  233.722283]   EC = 0x25: DABT (current EL), IL = 32 bits
      [  233.722596]   SET = 0, FnV = 0
      [  233.722805]   EA = 0, S1PTW = 0
      [  233.723026]   FSC = 0x04: level 0 translation fault
      [  233.723458] Data abort info:
      [  233.723734]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
      [  233.724176]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      [  233.724589]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
      [  233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000
      [  233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
      [  233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
      [  233.726720] Modules linked in:
      [  233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec23f92 #15
      [  233.727777] Hardware name: linux,dummy-virt (DT)
      [  233.728225] Workqueue: events update_pages_handler
      [  233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  233.729054] pc : rb_update_pages+0x1a8/0x3f8
      [  233.729334] lr : rb_update_pages+0x154/0x3f8
      [  233.729592] sp : ffff800082b9bd50
      [  233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
      [  233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418
      [  233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003
      [  233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58
      [  233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001
      [  233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000
      [  233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c
      [  233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0
      [  233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000
      [  233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000
      [  233.734418] Call trace:
      [  233.734593]  rb_update_pages+0x1a8/0x3f8
      [  233.734853]  update_pages_handler+0x1c/0x38
      [  233.735148]  process_one_work+0x1f0/0x468
      [  233.735525]  worker_thread+0x54/0x410
      [  233.735852]  kthread+0x124/0x138
      [  233.736064]  ret_from_fork+0x10/0x20
      [  233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060)
      [  233.736959] ---[ end trace 0000000000000000 ]---
      
      After analysis, the sequence of events leading to the error is as follows (steps 1-5):
      
      int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
      			int cpu_id)
      {
      	for_each_buffer_cpu(buffer, cpu) {
      		cpu_buffer = buffer->buffers[cpu];
      		//1. get cpu_buffer, aka cpu_buffer(A)
      		...
      		...
      		schedule_work_on(cpu,
      		 &cpu_buffer->update_pages_work);
      		//2. 'update_pages_work' is queued on 'cpu'. cpu_buffer(A) is passed to
      		// update_pages_handler, which performs the update and then calls
      		// complete(&cpu_buffer->update_done) to wake up the resize process.
      	//---->
      		//3. Just at this moment, ring_buffer_swap_cpu is triggered and
      		//cpu_buffer(A) is swapped with cpu_buffer(B), the max_buffer.
      		//ring_buffer_swap_cpu is called as shown in the 'Call trace' below.
      
      		Call trace:
      		 dump_backtrace+0x0/0x2f8
      		 show_stack+0x18/0x28
      		 dump_stack+0x12c/0x188
      		 ring_buffer_swap_cpu+0x2f8/0x328
      		 update_max_tr_single+0x180/0x210
      		 check_critical_timing+0x2b4/0x2c8
      		 tracer_hardirqs_on+0x1c0/0x200
      		 trace_hardirqs_on+0xec/0x378
      		 el0_svc_common+0x64/0x260
      		 do_el0_svc+0x90/0xf8
      		 el0_svc+0x20/0x30
      		 el0_sync_handler+0xb0/0xb8
      		 el0_sync+0x180/0x1c0
      	//<----
      
      	/* wait for all the updates to complete */
      	for_each_buffer_cpu(buffer, cpu) {
      		cpu_buffer = buffer->buffers[cpu];
      		//4. get cpu_buffer; cpu_buffer(B) is used in the following process,
      		//so the state of cpu_buffer(A) and cpu_buffer(B) is now inconsistent.
      		//For example, cpu_buffer(A)->update_done will remain set to 1, so the
      		//next resize round will not block in 'wait_for_completion'.
      		if (!cpu_buffer->nr_pages_to_update)
      			continue;
      
      		if (cpu_online(cpu))
      			wait_for_completion(&cpu_buffer->update_done);
      		cpu_buffer->nr_pages_to_update = 0;
      	}
      	...
      }
      	//5. The state of cpu_buffer(A) and cpu_buffer(B) is inconsistent;
      	//continuing to run in this state eventually triggers the oops.
      
      Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn
      
      Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      8a96c028
    • YueHaibing's avatar
      tracing: Remove unused extern declaration tracing_map_set_field_descr() · 1faf7e4a
      YueHaibing authored
      Since commit 08d43a5f ("tracing: Add lock-free tracing_map"),
      this has never been used, so it can be removed.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230722032123.24664-1-yuehaibing@huawei.com
      
      Cc: <mhiramat@kernel.org>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      1faf7e4a
    • Alexey Dobriyan's avatar
      kbuild: flatten KBUILD_CFLAGS · 0817d259
      Alexey Dobriyan authored
      
      
      Make it slightly easier to see which compiler options are added and
      removed (and not worry about column limit too!).
      
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reviewed-by: Nicolas Schier <n.schier@avm.de>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      0817d259
    • Benjamin Gray's avatar
      gen_compile_commands: add assembly files to compilation database · 1c679214
      Benjamin Gray authored
      
      
      Like C source files, tooling can find it useful to have the assembly
      source file compilation recorded.
      
      The .S extension appears to be used across all architectures.
      
      Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
      Reviewed-by: Fangrui Song <maskray@google.com>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      1c679214
    • Ojaswin Mujoo's avatar
      ext4: fix rbtree traversal bug in ext4_mb_use_preallocated · 9d3de7ee
      Ojaswin Mujoo authored
      During allocations, while looking for preallocations (PAs) in the per
      inode rbtree, we can't do a direct traversal of the tree because
      ext4_mb_discard_group_preallocation() can concurrently mark a PA as deleted
      and that can cause direct traversal to skip some entries. This was
      leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy
      our request and ultimately tried to create a new PA that would overlap
      with the missed one.
      
      To make sure we handle that case while still keeping the performance of
      the rbtree, we make use of the fact that the only pa that could possibly
      overlap the original goal start is the one that satisfies the below
      conditions:
      
        1. It must have its logical start immediately to the left of
        (i.e. less than) the original logical start.
      
        2. It must not be deleted
      
      To find this pa we use the following traversal method:
      
      1. Descend into the rbtree normally to find the immediate neighboring
      PA. Here we keep descending irrespective of if the PA is deleted or if
      it overlaps with our request etc. The goal is to find an immediately
      adjacent PA.
      
      2. If the found PA is on the right of the original goal, use rb_prev() to
      find the left adjacent PA.
      
      3. Check if this PA is deleted and keep moving left with rb_prev() until
      a non deleted PA is found.
      
      4. This is the PA we are looking for. Now we can check if it can satisfy
      the original request and proceed accordingly.
      
      This approach also takes care of having deleted PAs in the tree.
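
The four traversal steps above can be sketched in plain C. This is only an illustration, not the ext4 code: a sorted array stands in for the per inode rbtree, stepping the index left stands in for rb_prev(), and `struct pa`, `find_left_pa()` and `pa_covers()` are hypothetical names invented for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a preallocation (PA); not the ext4 structure. */
struct pa {
	unsigned long long lstart;	/* logical start block */
	unsigned int len;		/* length in blocks */
	bool deleted;			/* set once the PA is marked deleted */
};

/*
 * pas[] is sorted by lstart and stands in for the rbtree.
 * Returns the index of the nearest non-deleted PA whose lstart is at or
 * to the left of goal, or -1 if there is none.
 */
static int find_left_pa(const struct pa *pas, int n, unsigned long long goal)
{
	int lo = 0, hi = n - 1, pos = -1;

	/* Step 1: descend to an immediate neighbour, ignoring 'deleted'. */
	while (lo <= hi) {
		int mid = lo + (hi - lo) / 2;

		pos = mid;
		if (goal < pas[mid].lstart)
			hi = mid - 1;
		else if (goal > pas[mid].lstart)
			lo = mid + 1;
		else
			break;
	}
	if (pos < 0)
		return -1;

	/* Step 2: a right-hand neighbour means one step left (rb_prev analogue). */
	if (pas[pos].lstart > goal)
		pos--;

	/* Step 3: keep moving left past deleted entries. */
	while (pos >= 0 && pas[pos].deleted)
		pos--;

	return pos;
}

/* Step 4: check whether the PA found actually covers the goal block. */
static bool pa_covers(const struct pa *pa, unsigned long long goal)
{
	/* The end is computed in 64 bits so lstart + len cannot wrap at 32 bits. */
	return goal >= pa->lstart &&
	       goal < pa->lstart + (unsigned long long)pa->len;
}
```

With PAs starting at 0, 16 (deleted) and 32, each 8 blocks long, a goal of 20 lands on the PA at 32, steps left to the deleted PA at 16, and settles on the PA at 0, which does not cover block 20.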
      
      (While we are at it, also fix a possible overflow bug in calculating the
      end of a PA)
      
      [1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/
      
      Cc: stable@kernel.org # 6.4
      Fixes: 38727786 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list")
      Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      9d3de7ee
    • Ojaswin Mujoo's avatar
      ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail() · 5d5460fa
      Ojaswin Mujoo authored
      
      
      In ext4_mb_choose_next_group_best_avail(), we want the start order to be
      1 less than goal length and the min_order to be, at max, 1 more than the
      original length. This commit fixes an off by one issue that arose due to
      the fact that 1 << fls(n) > (n).
      
      After all the processing:
      
      order = 1 order below goal len
      min_order = maximum of the three:-
                   - order - trim_order
                   - 1 order below B2C(s_stripe)
                   - 1 order above original len
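
The arithmetic behind the off by one can be checked with a small sketch. `fls_sketch()` is a hypothetical userspace stand-in written for this example, assuming the kernel's fls() convention (1-based index of the highest set bit, 0 for an input of 0); it is not taken from the patch.

```c
/*
 * Userspace stand-in for fls(): returns the 1-based position of the
 * most significant set bit, or 0 when x is 0.
 */
static int fls_sketch(unsigned int x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Largest power-of-two order that does not exceed len (len must be > 0). */
static int order_not_above(unsigned int len)
{
	return fls_sketch(len) - 1;
}
```

For n = 8, fls_sketch(8) is 4, so 1 << fls(n) is 16, which overshoots n, while 1 << (fls(n) - 1) is exactly 8. For n = 5 the two values are 8 and 4, which bracket n from above and below.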
      
      Cc: stable@kernel.org
      Fixes: 33122aa930 ("ext4: Add allocation criteria 1.5 (CR1_5)")
      Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230609103403.112807-1-ojaswin@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      5d5460fa
    • Eric Whitney's avatar
      ext4: correct inline offset when handling xattrs in inode body · 6909cf5c
      Eric Whitney authored
      
      
      When run on a file system where the inline_data feature has been
      enabled, xfstests generic/269, generic/270, and generic/476 cause ext4
      to emit error messages indicating that inline directory entries are
      corrupted.  This occurs because the inline offset used to locate
      inline directory entries in the inode body is not updated when an
      xattr in that shared region is deleted and the region is shifted in
      memory to recover the space it occupied.  If the deleted xattr precedes
      the system.data attribute, which points to the inline directory entries,
      that attribute will be moved further up in the region.  The inline
      offset continues to point to whatever is located in system.data's former
      location, with unfortunate effects when used to access directory entries
      or (presumably) inline data in the inode body.
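
The class of bug described above, a cached offset going stale after a packed region is compacted, can be illustrated with a generic sketch. Nothing here is the ext4 layout or the actual fix: `region_delete()` and `fixup_cached_off()` are hypothetical helpers, and the region is assumed to be a flat byte array in which entries after the deleted one shift toward the start.

```c
#include <stddef.h>
#include <string.h>

/*
 * Remove 'len' bytes at offset 'off' from a packed region holding 'used'
 * bytes by shifting the tail down. Returns the new number of used bytes.
 */
static size_t region_delete(unsigned char *region, size_t used,
			    size_t off, size_t len)
{
	memmove(region + off, region + off + len, used - off - len);
	return used - len;
}

/*
 * Recompute a cached offset after deleting 'len' bytes at 'del_off'.
 * Anything located after the deleted span has moved down by 'len';
 * anything before it is unchanged.
 */
static size_t fixup_cached_off(size_t cached, size_t del_off, size_t len)
{
	return cached > del_off ? cached - len : cached;
}
```

If the region holds "AAAABBBBCCCC" and offset 8 is cached for the "CCCC" entry, deleting the four "B" bytes leaves "AAAACCCC". The cached value 8 now points past the used data, while the adjusted value 4 still locates "CCCC".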
      
      Cc: stable@kernel.org
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      6909cf5c
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · c2782531
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Reinstate support for little endian ELFv1 binaries, which it turns
         out still exist in the wild.
      
       - Revert a change which used asm goto for WARN_ON/__WARN_FLAGS, as it
         led to dead code generation and seemed to trigger compiler bugs in
         some edge cases.
      
       - Fix a deadlock in the pseries VAS code, between live migration and
         the driver's mmap handler.
      
       - Disable KCOV instrumentation in the powerpc KASAN code.
      
      Thanks to Andrew Donnellan, Benjamin Gray, Christophe Leroy, Haren
      Myneni, Russell Currey, and Uwe Kleine-König.
      
      * tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"
        powerpc/kasan: Disable KCOV in KASAN code
        powerpc/512x: lpbfifo: Convert to platform remove callback returning void
        powerpc/crypto: Add gitignore for generated P10 AES/GCM .S files
        Revert "powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto"
        powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close
      c2782531
    • Steve French's avatar
      cifs: update internal module version number for cifs.ko · ba61a03a
      Steve French authored
      
      
      From 2.43 to 2.44
      
      Signed-off-by: Steve French <stfrench@microsoft.com>
      ba61a03a
    • Shyam Prasad N's avatar
      cifs: allow dumping keys for directories too · b3edef6b
      Shyam Prasad N authored
      
      
      Dumping the enc/dec keys is a session wide operation.
      And it should not matter if the ioctl was run on
      a regular file or a directory.
      
      Currently, we obtain the tcon pointer from the
      cifs file handle. But since there's no dir open call
      in cifs, this is not populated for dirs.
      
      This change allows dumping of session keys using ioctl
      even for directories. To do this, we'll now get the
      tcon pointer from the superblock, and not from the file
      handle.
      
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      b3edef6b
    • Linus Torvalds's avatar
      Merge tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 295e1388
      Linus Torvalds authored
      Pull s390 fixes from Heiko Carstens:
      
       - Fix per vma lock fault handling: add missing !(fault & VM_FAULT_ERROR)
         check to fault handler to prevent error handling for return values
         that don't indicate an error
      
       - Use kfree_sensitive() instead of kfree() in paes crypto code to clear
         memory that may contain keys before freeing it
      
       - Fix reply buffer size calculation for CCA replies in zcrypt device
         driver
      
      * tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/zcrypt: fix reply buffer calculations for CCA replies
        s390/crypto: use kfree_sensitive() instead of kfree()
        s390/mm: fix per vma lock fault handling
      295e1388
    • Linus Torvalds's avatar
      Merge tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux · f036d67c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Fix for loop regressions (Mauricio)
      
       - Fix a potential stall with batched wakeups in sbitmap (David)
      
       - Fix for stall with recursive plug flushes (Ross)
      
       - Skip accounting of empty requests for blk-iocost (Chengming)
      
       - Remove a dead field in struct blk_mq_hw_ctx (Chengming)
      
      * tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux:
        loop: do not enforce max_loop hard limit by (new) default
        loop: deprecate autoloading callback loop_probe()
        sbitmap: fix batching wakeup
        blk-iocost: skip empty flush bio in iocost
        blk-mq: delete dead struct blk_mq_hw_ctx->queued field
        blk-mq: Fix stall due to recursive flush plug
      f036d67c
    • Linus Torvalds's avatar
      Merge tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux · bdd1d82e
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Fix for io-wq not always honoring REQ_F_NOWAIT, if it was set and
         punted directly (eg via DRAIN) (me)
      
       - Capability check fix (Ondrej)
      
       - Regression fix for the mmap changes that went into 6.4, which
         apparently broke IA64 (Helge)
      
      * tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux:
        ia64: mmap: Consider pgoff when searching for free mapping
        io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()
        io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
        io_uring: don't audit the capability check in io_uring_create()
      bdd1d82e
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 725d444d
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Fix moortec,mr75203 schema usage of 'multipleOf' keyword
      
       - Fix regression in systems depending on "of-display" device name
      
       - Build fix for s390 with CONFIG_PCI=n and OF_EARLY_FLATTREE=y
      
       - Drop two obsolete serial .txt bindings
      
      * tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: serial: Remove obsolete nxp,lpc1850-uart.txt
        dt-bindings: serial: Remove obsolete cavium-uart.txt
        dt-bindings: hwmon: moortec,mr75203: fix multipleOf for coefficients
        of: Preserve "of-display" device name for compatibility
        of: make OF_EARLY_FLATTREE depend on HAS_IOMEM
      725d444d