  1. May 04, 2022
  2. Apr 22, 2022
    • signal: Deliver SIGTRAP on perf event asynchronously if blocked · 78ed93d7
      Marco Elver authored
      With SIGTRAP on perf events, we have encountered termination of
      processes due to user space attempting to block delivery of SIGTRAP.
      Consider this case:
      
          <set up SIGTRAP on a perf event>
          ...
          sigset_t s;
          sigemptyset(&s);
          sigaddset(&s, SIGTRAP | <and others>);
          sigprocmask(SIG_BLOCK, &s, ...);
          ...
          <perf event triggers>
      
      When the perf event triggers, while SIGTRAP is blocked, force_sig_perf()
      will force the signal, but revert back to the default handler, thus
      terminating the task.
      
      This makes sense for error conditions, but not so much for explicitly
      requested monitoring. However, the expectation is still that signals
      generated by perf events are synchronous, which will no longer be the
      case if the signal is blocked and delivered later.
      
      To give user space the ability to clearly distinguish synchronous from
      asynchronous signals, introduce siginfo_t::si_perf_flags and
      TRAP_PERF_FLAG_ASYNC (opted for flags in case more binary information is
      required in future).
      
      The resolution to the problem is then to (a) no longer force the signal
      (avoiding the terminations), but (b) tell user space via si_perf_flags
      if the signal was synchronous or not, so that such signals can be
      handled differently (e.g. let user space decide to ignore or consider
      the data imprecise).
      
      The alternative of making the kernel ignore SIGTRAP on perf events if
      the signal is blocked may work for some usecases, but likely causes
      issues in others that then have to revert back to interception of
      sigprocmask() (which we want to avoid). [ A concrete example: when using
      breakpoint perf events to track data-flow, in a region of code where
      signals are blocked, data-flow can no longer be tracked accurately.
      When a relevant asynchronous signal is received after unblocking the
      signal, the data-flow tracking logic needs to know its state is
      imprecise. ]
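
The deferred delivery described above can be observed with ordinary signal calls, without any perf event. The following is a minimal sketch, not the kernel's code: `perf_sigtrap_is_async()` and `blocked_sigtrap_is_deferred()` are hypothetical helper names, `raise()` stands in for the perf event, and the fallback definition of `TRAP_PERF_FLAG_ASYNC` assumes the value used by the kernel UAPI header.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Fallback for older libc headers; assumed to mirror the kernel UAPI value. */
#ifndef TRAP_PERF_FLAG_ASYNC
#define TRAP_PERF_FLAG_ASYNC (1u << 0)
#endif

static volatile sig_atomic_t trap_seen;

static void on_trap(int sig)
{
    (void)sig;
    trap_seen = 1;
}

/* Hypothetical helper: non-zero when the flags mark the signal as asynchronous. */
int perf_sigtrap_is_async(unsigned int si_perf_flags)
{
    return (si_perf_flags & TRAP_PERF_FLAG_ASYNC) != 0;
}

/*
 * Returns 1 if a SIGTRAP raised while blocked stayed pending and reached the
 * handler only after unblocking, 0 otherwise, and -1 on setup failure.
 */
int blocked_sigtrap_is_deferred(void)
{
    struct sigaction sa;
    sigset_t block, pending;
    int held;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_trap;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTRAP, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGTRAP);   /* sigaddset() takes one signal number per call */
    if (sigprocmask(SIG_BLOCK, &block, NULL) != 0)
        return -1;

    trap_seen = 0;
    raise(SIGTRAP);               /* stays pending while SIGTRAP is blocked */
    held = !trap_seen && sigpending(&pending) == 0 &&
           sigismember(&pending, SIGTRAP) == 1;

    sigprocmask(SIG_UNBLOCK, &block, NULL);   /* pending signal is delivered here */
    return held && trap_seen;
}
```

In a real SA_SIGINFO handler for a perf event, the flags passed to `perf_sigtrap_is_async()` would come from `si_perf_flags`, provided the installed headers expose that field.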
      
      Fixes: 97ba62b2 ("perf: Add support for SIGTRAP on perf events")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Link: https://lore.kernel.org/r/20220404111204.935357-1-elver@google.com
  3. Apr 05, 2022
    • perf/x86: Unify format of events sysfs show · 7bebfe9d
      Yang Jihong authored
      
      
      Sysfs show formats of files in /sys/devices/cpu/events/ are not unified,
      some end with "\n", and some do not. Modify sysfs show format of events
      defined by EVENT_ATTR_STR to end with "\n".
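
The intended format can be illustrated in user space. This is only a sketch: `show_event_str()` is a hypothetical stand-in for a sysfs show routine, not the kernel function touched by the patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical show routine: writes the event string followed by a newline
 * and returns the number of characters written (excluding the terminator).
 */
int show_event_str(char *buf, size_t size, const char *event_str)
{
    return snprintf(buf, size, "%s\n", event_str);
}
```

With the trailing newline in place, `cat -A` shows a `$` at the end of every entry.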
      
      Before:
        $ ls /sys/devices/cpu/events/* | xargs -i sh -c 'echo -n "{}: "; cat -A {}; echo'
        branch-instructions: event=0xc4$
      
        branch-misses: event=0xc5$
      
        bus-cycles: event=0x3c,umask=0x01$
      
        cache-misses: event=0x2e,umask=0x41$
      
        cache-references: event=0x2e,umask=0x4f$
      
        cpu-cycles: event=0x3c$
      
        instructions: event=0xc0$
      
        ref-cycles: event=0x00,umask=0x03$
      
        slots: event=0x00,umask=0x4
        topdown-bad-spec: event=0x00,umask=0x81
        topdown-be-bound: event=0x00,umask=0x83
        topdown-fe-bound: event=0x00,umask=0x82
        topdown-retiring: event=0x00,umask=0x80
      
      After:
        $ ls /sys/devices/cpu/events/* | xargs -i sh -c 'echo -n "{}: "; cat -A {}; echo'
        /sys/devices/cpu/events/branch-instructions: event=0xc4$
      
        /sys/devices/cpu/events/branch-misses: event=0xc5$
      
        /sys/devices/cpu/events/bus-cycles: event=0x3c,umask=0x01$
      
        /sys/devices/cpu/events/cache-misses: event=0x2e,umask=0x41$
      
        /sys/devices/cpu/events/cache-references: event=0x2e,umask=0x4f$
      
        /sys/devices/cpu/events/cpu-cycles: event=0x3c$
      
        /sys/devices/cpu/events/instructions: event=0xc0$
      
        /sys/devices/cpu/events/ref-cycles: event=0x00,umask=0x03$
      
        /sys/devices/cpu/events/slots: event=0x00,umask=0x4$
      
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20220324031957.135595-1-yangjihong1@huawei.com
    • perf/x86/amd: Add idle hooks for branch sampling · d5616bac
      Stephane Eranian authored
      
      
      On AMD Fam19h Zen3, the branch sampling (BRS) feature must be disabled before
      entering low power and re-enabled (if it was active) when returning from low
      power. Otherwise, the NMI may be held up for too long and cause
      problems. Stopping BRS will cause the NMI to be delivered if it was held up.
      
      Define a perf_amd_brs_lopwr_cb() callback to stop/restart BRS.  The callback
      is protected by a jump label which is enabled only when AMD BRS is detected.
      In all other cases, the callback is never called.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      [peterz: static_call() and build fixes]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-10-eranian@google.com
    • ACPI: Add perf low power callback · 2a606a18
      Stephane Eranian authored
      
      
      Add an optional callback needed by some PMU features, e.g., AMD
      BRS, to give a chance to the perf_events code to change its state before
      a CPU goes to low power and after it comes back.
      
      The callback is void when the PERF_NEEDS_LOPWR_CB flag is not set.
      This flag must be set in arch specific perf_event.h header whenever needed.
      When not set, there is no impact on the ACPI code.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      [peterz: build fix]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-9-eranian@google.com
    • perf/x86/amd: Make Zen3 branch sampling opt-in · cc37e520
      Stephane Eranian authored
      
      
      Add a kernel config option CONFIG_PERF_EVENTS_AMD_BRS
      to make the support for AMD Zen3 Branch Sampling (BRS) an opt-in
      compile time option.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-8-eranian@google.com
    • perf/x86/amd: Add AMD branch sampling period adjustment · ba2fe750
      Stephane Eranian authored
      
      
      Add code to adjust the sampling event period when used with the Branch
      Sampling feature (BRS). Given the depth of the BRS (16), the period is
      reduced by that depth such that in the best case scenario, BRS saturates at
      the desired sampling period. In practice, though, the processor may execute
      more branches. Given a desired period P and a depth D, the kernel programs
      the actual period at P - D. After P occurrences of the sampling event, the
      counter overflows. It then may take X branches (skid) before the NMI is
      caught and held by the hardware and BRS activates. Then, after D branches,
      BRS saturates and the NMI is delivered.  With no skid, the effective period
      would be (P - D) + D = P. In practice, however, it will likely be (P - D) +
      X + D. There is no way to eliminate X or predict X.
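
The arithmetic above can be written out as a small sketch. The function names are hypothetical and the code only restates the formulas from this message; it is not the kernel implementation.

```c
#include <stdint.h>

#define BRS_DEPTH 16u   /* depth D as stated in the commit message */

/* Period programmed into the counter: P - D. Assumes desired > BRS_DEPTH. */
uint64_t brs_programmed_period(uint64_t desired)
{
    return desired - BRS_DEPTH;
}

/* Effective period with a skid of X branches: (P - D) + X + D. */
uint64_t brs_effective_period(uint64_t desired, uint64_t skid)
{
    return brs_programmed_period(desired) + skid + BRS_DEPTH;
}
```

For a desired period of 1000037, the programmed period is 1000021; with no skid the effective period is 1000037 again, and every skid branch adds one to it.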
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-7-eranian@google.com
    • perf/x86/amd: Enable branch sampling priv level filtering · 8910075d
      Stephane Eranian authored
      
      
      The AMD Branch Sampling feature does not provide hardware filtering by
      privilege level. The associated PMU counter does, but the branch sampling
      itself does not. Given how BRS operates, there is a possibility that BRS
      captures kernel level branches even though the event is programmed to count
      only at the user level.
      
      Implement a workaround in software by removing the branches which belong to
      the wrong privilege level. The privilege level is evaluated on the target of
      the branch and not the source so as to be compatible with other architectures.
      As a consequence of this patch, the number of entries in the
      PERF_RECORD_BRANCH_STACK buffer may be less than the maximum (16).  It could
      even be zero. Another consequence is that consecutive entries in the branch
      stack may not reflect actual code path and may have discontinuities, in case
      kernel branches were suppressed. But this is no different than what happens
      on other architectures.
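
The software workaround can be sketched as an in-place filter on the branch target. Everything here is illustrative: `struct branch`, `is_kernel_addr()` and `filter_branches_by_target()` are hypothetical names, and treating addresses with the top bit set as kernel addresses is an assumption about the x86-64 layout, not something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical branch record: source and target addresses only. */
struct branch {
    uint64_t from;
    uint64_t to;
};

/* Assumption: on x86-64, addresses in the upper half belong to the kernel. */
static bool is_kernel_addr(uint64_t addr)
{
    return (addr >> 63) != 0;
}

/*
 * Keeps only entries whose target matches a requested privilege level,
 * compacting the array in place. Returns the number of entries kept,
 * which may be smaller than n or even zero.
 */
size_t filter_branches_by_target(struct branch *br, size_t n,
                                 bool keep_user, bool keep_kernel)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        bool kernel = is_kernel_addr(br[i].to);

        if (kernel ? keep_kernel : keep_user)
            br[out++] = br[i];
    }
    return out;
}
```

Because the decision is made on the target, a branch returning from the kernel to user code is kept in a user-only sample, while the branch that entered the kernel is dropped, which is where the discontinuities come from.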
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-6-eranian@google.com
    • perf/x86/amd: Add branch-brs helper event for Fam19h BRS · 44175993
      Stephane Eranian authored
      
      
      Add a pseudo event called branch-brs to help use the AMD Fam19h
      Branch Sampling feature (BRS). BRS samples taken branches, so it is best used
      when sampling on a retired taken branch event (0xc4) which is what BRS
      captures.  Instead of trying to remember the event code or actual event name,
      users can simply do:
      
      $ perf record -b -e cpu/branch-brs/ -c 1000037 .....
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-5-eranian@google.com
    • perf/x86/amd: Add AMD Fam19h Branch Sampling support · ada54345
      Stephane Eranian authored
      
      
      Add support for the AMD Fam19h 16-deep branch sampling feature as
      described in the AMD PPR Fam19h Model 01h Revision B1.  This is a model
      specific extension. It is not an architected AMD feature.
      
      The Branch Sampling (BRS) operates with a 16-deep saturating buffer in MSR
      registers. There is no branch type filtering. All control flow changes are
      captured. BRS relies on specific programming of the core PMU of Fam19h.  In
      particular, the following requirements must be met:
       - the sampling period must be greater than 16 (BRS depth)
       - the sampling period must be fixed, not in frequency mode
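
The two requirements can be expressed as a simple validity check. This is a sketch under stated assumptions: `brs_config_ok()` is a hypothetical helper that only restates the rules listed above, not the check the kernel performs.

```c
#include <stdbool.h>
#include <stdint.h>

#define BRS_DEPTH 16u   /* BRS depth as stated in the commit message */

/*
 * Hypothetical check: BRS requires a fixed sampling period (no frequency
 * mode) that is strictly greater than the BRS depth.
 */
bool brs_config_ok(uint64_t sample_period, bool freq_mode)
{
    if (freq_mode)
        return false;
    return sample_period > BRS_DEPTH;
}
```

A period of exactly 16 is rejected, 17 is the smallest accepted value, and any frequency-mode request is rejected regardless of the period.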
      
      BRS interacts with the NMI interrupt as well. Because enabling BRS is
      expensive, it is only activated after P event occurrences, where P is the
      desired sampling period.  At P occurrences of the event, the counter
      overflows, the CPU catches the interrupt, activates BRS for 16 branches until
      it saturates, and then delivers the NMI to the kernel.  Between the overflow
      and the time BRS activates more branches may be executed skewing the period.
      All along, the sampling event keeps counting. The skid may be attenuated by
      reducing the sampling period by 16 (subsequent patch).
      
      BRS is integrated into perf_events seamlessly via the same
      PERF_RECORD_BRANCH_STACK sample format. BRS generates perf_branch_entry
      records in the sampling buffer. No prediction information is supported. The
      branches are stored in reverse order of execution.  The most recent branch is
      the first entry in each record.
      
      No modification to the perf tool is necessary.
      
      BRS can be used with any sampling event. However, it is recommended to use
      the RETIRED_BRANCH_INSTRUCTIONS event because it matches what the BRS
      captures.
      
      $ perf record -b -c 1000037 -e cpu/event=0xc2,name=ret_br_instructions/ test
      
      $ perf report -D
      56531696056126 0x193c000 [0x1a8]: PERF_RECORD_SAMPLE(IP, 0x2): 18122/18230: 0x401d24 period: 1000037 addr: 0
      ... branch stack: nr:16
      .....  0: 0000000000401d24 -> 0000000000401d5a 0 cycles      0
      .....  1: 0000000000401d5c -> 0000000000401d24 0 cycles      0
      .....  2: 0000000000401d22 -> 0000000000401d5c 0 cycles      0
      .....  3: 0000000000401d5e -> 0000000000401d22 0 cycles      0
      .....  4: 0000000000401d20 -> 0000000000401d5e 0 cycles      0
      .....  5: 0000000000401d3e -> 0000000000401d20 0 cycles      0
      .....  6: 0000000000401d42 -> 0000000000401d3e 0 cycles      0
      .....  7: 0000000000401d3c -> 0000000000401d42 0 cycles      0
      .....  8: 0000000000401d44 -> 0000000000401d3c 0 cycles      0
      .....  9: 0000000000401d3a -> 0000000000401d44 0 cycles      0
      ..... 10: 0000000000401d46 -> 0000000000401d3a 0 cycles      0
      ..... 11: 0000000000401d38 -> 0000000000401d46 0 cycles      0
      ..... 12: 0000000000401d48 -> 0000000000401d38 0 cycles      0
      ..... 13: 0000000000401d36 -> 0000000000401d48 0 cycles      0
      ..... 14: 0000000000401d4a -> 0000000000401d36 0 cycles      0
      ..... 15: 0000000000401d34 -> 0000000000401d4a 0 cycles      0
       ... thread: test:18230
       ...... dso: test
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-4-eranian@google.com
    • x86/cpufeatures: Add AMD Fam19h Branch Sampling feature · a77d41ac
      Stephane Eranian authored
      
      
      Add a cpu feature for AMD Fam19h Branch Sampling feature as bit
      31 of EBX on CPUID leaf function 0x80000008.
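
Testing the bit described above is a one-line mask. The sketch below works on an EBX value that the caller has already read; `AMD_BRS_EBX_BIT` and `cpuid_ebx_has_brs()` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define AMD_BRS_EBX_BIT 31u   /* bit position stated in the commit message */

/* Returns true when bit 31 of the EBX value from leaf 0x80000008 is set. */
bool cpuid_ebx_has_brs(uint32_t ebx)
{
    return ((ebx >> AMD_BRS_EBX_BIT) & 1u) != 0;
}
```

On x86 with GCC or Clang, the EBX value could be obtained from user space with `__get_cpuid()` from `<cpuid.h>`; the helper is kept free of that call so that it builds on any architecture.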
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-3-eranian@google.com
    • perf/core: Add perf_clear_branch_entry_bitfields() helper · bfe4daf8
      Stephane Eranian authored
      
      
      Make it simpler to reset all the info fields on the
      perf_branch_entry by adding a helper inline function.
      
      The goal is to centralize the initialization to avoid missing
      a field in case more are added.
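
The idea of a single reset helper can be sketched as follows. `struct branch_entry` is an illustrative stand-in for perf_branch_entry, and its field names and widths are assumptions made for the example; the point is only that all info fields are cleared in one place while the addresses are left alone.

```c
#include <stdint.h>

/* Illustrative stand-in for perf_branch_entry; the layout is an assumption. */
struct branch_entry {
    uint64_t from;
    uint64_t to;
    uint64_t mispred:1,
             predicted:1,
             in_tx:1,
             abort:1,
             cycles:16,
             type:4,
             reserved:40;
};

/* Resets every info bitfield in one place; from/to are left untouched. */
static inline void clear_branch_entry_bitfields(struct branch_entry *br)
{
    br->mispred = 0;
    br->predicted = 0;
    br->in_tx = 0;
    br->abort = 0;
    br->cycles = 0;
    br->type = 0;
    br->reserved = 0;
}
```

If a new bitfield is added later, only this helper needs to change, rather than every driver that fills in branch entries.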
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-2-eranian@google.com
  4. Apr 04, 2022
    • Linux 5.18-rc1 · 31231092
      Linus Torvalds authored
      v5.18-rc1
    • Merge tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 09bb8856
      Linus Torvalds authored
      Pull more tracing updates from Steven Rostedt:
      
       - Rename the staging files to give them some meaning. Just
         stage1, stage2, etc. does not show what they are for
      
       - Check for NULL from allocation in bootconfig
      
       - Hold event mutex for dyn_event call in user events
      
       - Mark user events as broken (to work on the API)
      
       - Remove eBPF updates from user events
      
       - Remove user events from uapi header to keep it from being installed.
      
       - Move ftrace_graph_is_dead() into inline as it is called from hot
         paths and also convert it into a static branch.
      
      * tag 'trace-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Move user_events.h temporarily out of include/uapi
        ftrace: Make ftrace_graph_is_dead() a static branch
        tracing: Set user_events to BROKEN
        tracing/user_events: Remove eBPF interfaces
        tracing/user_events: Hold event_mutex during dyn_event_add
        proc: bootconfig: Add null pointer check
        tracing: Rename the staging files for trace_events
    • Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 34a53ff9
      Linus Torvalds authored
      Pull clk fix from Stephen Boyd:
       "A single revert to fix a boot regression seen when clk_put() started
        dropping rate range requests. It's best to keep various systems
        booting so we'll kick this out and try again next time"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        Revert "clk: Drop the rate range on clk_put()"
    • Merge tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8b5656bc
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A set of x86 fixes and updates:
      
         - Make the prctl() for enabling dynamic XSTATE components correct so
           it adds the newly requested feature to the permission bitmap
           instead of overwriting it. Add a selftest which validates that.
      
         - Unroll string MMIO for encrypted SEV guests as the hypervisor
           cannot emulate it.
      
         - Handle supervisor states correctly in the FPU/XSTATE code so it
           takes the feature set of the fpstate buffer into account. The
           feature sets can differ between host and guest buffers. Guest
           buffers do not contain supervisor states. So far this was not an
           issue, but with enabling PASID it needs to be handled in the buffer
           offset calculation and in the permission bitmaps.
      
         - Avoid a gazillion repeated CPUID invocations by caching the
           values early in the FPU/XSTATE code.
      
         - Enable CONFIG_WERROR in x86 defconfig.
      
         - Make the X86 defconfigs more useful by adapting them to Y2022
           reality"
      
      * tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/fpu/xstate: Consolidate size calculations
        x86/fpu/xstate: Handle supervisor states in XSTATE permissions
        x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
        x86/fpu: Cache xfeature flags from CPUID
        x86/fpu/xsave: Initialize offset/size cache early
        x86/fpu: Remove unused supervisor only offsets
        x86/fpu: Remove redundant XCOMP_BV initialization
        x86/sev: Unroll string mmio with CC_ATTR_GUEST_UNROLL_STRING_IO
        x86/config: Make the x86 defconfigs a bit more usable
        x86/defconfig: Enable WERROR
        selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test
        x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation
    • Merge tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e235f419
      Linus Torvalds authored
      Pull RT signal fix from Thomas Gleixner:
       "Revert the RT related signal changes. They need to be reworked and
        generalized"
      
      * tag 'core-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"
    • Merge tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping · 63d12cc3
      Linus Torvalds authored
      Pull more dma-mapping updates from Christoph Hellwig:
      
       - fix a regression in dma remap handling vs AMD memory encryption (me)
      
       - finally kill off the legacy PCI DMA API (Christophe JAILLET)
      
      * tag 'dma-mapping-5.18-1' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: move pgprot_decrypted out of dma_pgprot
        PCI/doc: cleanup references to the legacy PCI DMA API
        PCI: Remove the deprecated "pci-dma-compat.h" API
    • Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 5dee8721
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
      
       - avoid unnecessary rebuilds for library objects
      
       - fix return value of __setup handlers
      
       - fix invalid input check for "crashkernel=" kernel option
      
       - silence KASAN warnings in unwind_frame
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 9191/1: arm/stacktrace, kasan: Silence KASAN warnings in unwind_frame()
        ARM: 9190/1: kdump: add invalid input check for 'crashkernel=0'
        ARM: 9187/1: JIVE: fix return value of __setup handler
        ARM: 9189/1: decompressor: fix unneeded rebuilds of library objects
  5. Apr 03, 2022
  6. Apr 02, 2022