Skip to content
  1. Feb 17, 2021
  2. Feb 10, 2021
  3. Feb 01, 2021
    • Kan Liang's avatar
      perf/x86/intel: Support CPUID 10.ECX to disable fixed counters · 32451614
      Kan Liang authored
      
      
      With Architectural Performance Monitoring Version 5, CPUID 10.ECX cpu
      leaf indicates the fixed counter enumeration. This extends the previous
      count to a bitmap which allows disabling even lower fixed counters.
      It could be used by a Hypervisor.
      
      The existing intel_ctrl variable is used to remember the bitmask of the
      counters. All code that reads all counters is fixed to check this extra
      bitmask.
      
      Suggested-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Originally-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1611873611-156687-6-git-send-email-kan.liang@linux.intel.com
      32451614
    • Kan Liang's avatar
      perf/x86/intel: Add perf core PMU support for Sapphire Rapids · 61b985e3
      Kan Liang authored
      
      
      Add perf core PMU support for the Intel Sapphire Rapids server, which is
      the successor of the Intel Ice Lake server. The enabling code is based
      on Ice Lake, but there are several new features introduced.
      
      The event encoding is changed and simplified, e.g., the event codes
      which are below 0x90 are restricted to counters 0-3. The event codes
      which above 0x90 are likely to have no restrictions. The event
      constraints, extra_regs(), and hardware cache events table are changed
      accordingly.
      
      A new Precise Distribution (PDist) facility is introduced, which
      further minimizes the skid when a precise event is programmed on the GP
      counter 0. Enable the Precise Distribution (PDist) facility with :ppp
      event. For this facility to work, the period must be initialized with a
      value larger than 127. Add spr_limit_period() to apply the limit for
      :ppp event.
      
      Two new data source fields, data block & address block, are added in the
      PEBS Memory Info Record for the load latency event. To enable the
      feature,
      - An auxiliary event has to be enabled together with the load latency
        event on Sapphire Rapids. A new flag PMU_FL_MEM_LOADS_AUX is
        introduced to indicate the case. A new event, mem-loads-aux, is
        exposed to sysfs for the user tool.
        Add a check in hw_config(). If the auxiliary event is not detected,
        return an unique error -ENODATA.
      - The union perf_mem_data_src is extended to support the new fields.
      - Ice Lake and earlier models do not support block information, but the
        fields may be set by HW on some machines. Add pebs_no_block to
        explicitly indicate the previous platforms which don't support the new
        block fields. Accessing the new block fields are ignored on those
        platforms.
      
      A new store Latency facility is introduced, which leverages the PEBS
      facility where it can provide additional information about sampled
      stores. The additional information includes the data address, memory
      auxiliary info (e.g. Data Source, STLB miss) and the latency of the
      store access. To enable the facility, the new event (0x02cd) has to be
      programed on the GP counter 0. A new flag PERF_X86_EVENT_PEBS_STLAT is
      introduced to indicate the event. The store_latency_data() is introduced
      to parse the memory auxiliary info.
      
      The layout of access latency field of PEBS Memory Info Record has been
      changed. Two latency, instruction latency (bit 15:0) and cache access
      latency (bit 47:32) are recorded.
      - The cache access latency is similar to previous memory access latency.
        For loads, the latency starts by the actual cache access until the
        data is returned by the memory subsystem.
        For stores, the latency starts when the demand write accesses the L1
        data cache and lasts until the cacheline write is completed in the
        memory subsystem.
        The cache access latency is stored in low 32bits of the sample type
        PERF_SAMPLE_WEIGHT_STRUCT.
      - The instruction latency starts by the dispatch of the load operation
        for execution and lasts until completion of the instruction it belongs
        to.
        Add a new flag PMU_FL_INSTR_LATENCY to indicate the instruction
        latency support. The instruction latency is stored in the bit 47:32
        of the sample type PERF_SAMPLE_WEIGHT_STRUCT.
      
      Extends the PERF_METRICS MSR to feature TMA method level 2 metrics. The
      lower half of the register is the TMA level 1 metrics (legacy). The
      upper half is also divided into four 8-bit fields for the new level 2
      metrics. Expose all eight Topdown metrics events to user space.
      
      The full description for the SPR features can be found at Intel
      Architecture Instruction Set Extensions and Future Features
      Programming Reference, 319433-041.
      
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1611873611-156687-5-git-send-email-kan.liang@linux.intel.com
      61b985e3
    • Kan Liang's avatar
      perf/x86/intel: Filter unsupported Topdown metrics event · 1ab5f235
      Kan Liang authored
      
      
      Intel Sapphire Rapids server will introduce 8 metrics events. Intel
      Ice Lake only supports 4 metrics events. A perf tool user may mistakenly
      use the unsupported events via RAW format on Ice Lake. The user can
      still get a value from the unsupported Topdown metrics event once the
      following Sapphire Rapids enabling patch is applied.
      
      To enable the 8 metrics events on Intel Sapphire Rapids, the
      INTEL_TD_METRIC_MAX has to be updated, which impacts the
      is_metric_event(). The is_metric_event() is a generic function.
      On Ice Lake, the newly added SPR metrics events will be mistakenly
      accepted as metric events on creation. At runtime, the unsupported
      Topdown metrics events will be updated.
      
      Add a variable num_topdown_events in x86_pmu to indicate the available
      number of the Topdown metrics event on the platform. Apply the number
      into is_metric_event(). Only the supported Topdown metrics events
      should be created as metrics events.
      
      Apply the num_topdown_events in icl_update_topdown_event() as well. The
      function can be reused by the following patch.
      
      Suggested-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1611873611-156687-4-git-send-email-kan.liang@linux.intel.com
      1ab5f235
    • Kan Liang's avatar
      perf/x86/intel: Factor out intel_update_topdown_event() · 628d923a
      Kan Liang authored
      
      
      Similar to Ice Lake, Intel Sapphire Rapids server also supports the
      topdown performance metrics feature. The difference is that Intel
      Sapphire Rapids server extends the PERF_METRICS MSR to feature TMA
      method level two metrics, which will introduce 8 metrics events. Current
      icl_update_topdown_event() only check 4 level one metrics events.
      
      Factor out intel_update_topdown_event() to facilitate the code sharing
      between Ice Lake and Sapphire Rapids.
      
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1611873611-156687-3-git-send-email-kan.liang@linux.intel.com
      628d923a
    • Kan Liang's avatar
      perf/core: Add PERF_SAMPLE_WEIGHT_STRUCT · 2a6c6b7d
      Kan Liang authored
      
      
      Current PERF_SAMPLE_WEIGHT sample type is very useful to expresses the
      cost of an action represented by the sample. This allows the profiler
      to scale the samples to be more informative to the programmer. It could
      also help to locate a hotspot, e.g., when profiling by memory latencies,
      the expensive load appear higher up in the histograms. But current
      PERF_SAMPLE_WEIGHT sample type is solely determined by one factor. This
      could be a problem, if users want two or more factors to contribute to
      the weight. For example, Golden Cove core PMU can provide both the
      instruction latency and the cache Latency information as factors for the
      memory profiling.
      
      For current X86 platforms, although meminfo::latency is defined as a
      u64, only the lower 32 bits include the valid data in practice (No
      memory access could last than 4G cycles). The higher 32 bits can be used
      to store new factors.
      
      Add a new sample type, PERF_SAMPLE_WEIGHT_STRUCT, to indicate the new
      sample weight structure. It shares the same space as the
      PERF_SAMPLE_WEIGHT sample type.
      
      Users can apply either the PERF_SAMPLE_WEIGHT sample type or the
      PERF_SAMPLE_WEIGHT_STRUCT sample type to retrieve the sample weight, but
      they cannot apply both sample types simultaneously.
      
      Currently, only X86 and PowerPC use the PERF_SAMPLE_WEIGHT sample type.
      - For PowerPC, there is nothing changed for the PERF_SAMPLE_WEIGHT
        sample type. There is no effect for the new PERF_SAMPLE_WEIGHT_STRUCT
        sample type. PowerPC can re-struct the weight field similarly later.
      - For X86, the same value will be dumped for the PERF_SAMPLE_WEIGHT
        sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type for now.
        The following patches will apply the new factors for the
        PERF_SAMPLE_WEIGHT_STRUCT sample type.
      
      The field in the union perf_sample_weight should be shared among
      different architectures. A generic name is required, but it's hard to
      abstract a name that applies to all architectures. For example, on X86,
      the fields are to store all kinds of latency. While on PowerPC, it
      stores MMCRA[TECX/TECM], which should not be latency. So a general name
      prefix 'var$NUM' is used here.
      
      Suggested-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1611873611-156687-2-git-send-email-kan.liang@linux.intel.com
      2a6c6b7d
  4. Jan 28, 2021
  5. Jan 14, 2021
  6. Jan 05, 2021
  7. Jan 04, 2021
  8. Jan 03, 2021
    • Linus Torvalds's avatar
      Merge tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3516bd72
      Linus Torvalds authored
      Pull s390 cleanups from Vasily Gorbik:
       "Update defconfigs and sort config select list"
      
      * tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/Kconfig: sort config S390 select list once again
        s390: update defconfigs
      3516bd72
    • Linus Torvalds's avatar
      Merge tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · d9296a7b
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix a crash in intel_pstate during resume from suspend-to-RAM
        that may occur after recent changes and two resource leaks in error
        paths in the operating performance points (OPP) framework, add a new
        C-states table to intel_idle and update the cpuidle MAINTAINERS entry
        to cover the governors too.
      
        Specifics:
      
         - Fix recently introduced crash in the intel_pstate driver that
           occurs if scale-invariance is disabled during resume from
           suspend-to-RAM due to inconsistent changes of APERF or MPERF MSR
           values made by the platform firmware (Rafael Wysocki).
      
         - Fix a memory leak and add a missing clk_put() in error paths in the
           OPP framework (Quanyang Wang, Viresh Kumar).
      
         - Add new C-states table for SnowRidge processors to the intel_idle
           driver (Artem Bityutskiy).
      
         - Update the MAINTAINERS entry for cpuidle to make it clear that the
           governors are covered by it too (Lukas Bulwahn)"
      
      * tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        intel_idle: add SnowRidge C-state table
        cpufreq: intel_pstate: Fix fast-switch fallback path
        opp: Call the missing clk_put() on error
        opp: fix memory leak in _allocate_opp_table
        MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
      d9296a7b
  9. Jan 02, 2021
    • Rafael J. Wysocki's avatar
      Merge branches 'pm-cpufreq' and 'pm-cpuidle' · 89ecf09e
      Rafael J. Wysocki authored
      * pm-cpufreq:
        cpufreq: intel_pstate: Fix fast-switch fallback path
      
      * pm-cpuidle:
        intel_idle: add SnowRidge C-state table
        MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
      89ecf09e
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · eda809ae
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi).
      
        The big core two fixes are for power management ("block: Do not accept
        any requests while suspended" and "block: Fix a race in the runtime
        power management code") which finally sorts out the resume problems
        we've occasionally been having.
      
        To make the resume fix, there are seven necessary precursors which
        effectively renames REQ_PREEMPT to REQ_PM, so every "special" request
        in block is automatically a power management exempt one.
      
        All of the non-PM preempt cases are removed except for the one in the
        SCSI Parallel Interface (spi) domain validation which is a genuine
        case where we have to run requests at high priority to validate the
        bus so this becomes an autopm get/put protected request"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits)
        scsi: cxgb4i: Fix TLS dependency
        scsi: ufs: Un-inline ufshcd_vops_device_reset function
        scsi: ufs: Re-enable WriteBooster after device reset
        scsi: ufs-mediatek: Use correct path to fix compile error
        scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()
        scsi: block: Do not accept any requests while suspended
        scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
        scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
        scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
        scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
        scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
        scsi: block: Introduce BLK_MQ_REQ_PM
        scsi: block: Fix a race in the runtime power management code
        scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
        scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
        scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
        scsi: ufs-pci: Fix restore from S4 for Intel controllers
        scsi: ufs-mediatek: Keep VCC always-on for specific devices
        scsi: ufs: Allow regulators being always-on
        scsi: ufs: Clear UAC for RPMB after ufshcd resets
        ...
      eda809ae
    • Linus Torvalds's avatar
      Merge tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block · 8b4805c6
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Two minor block fixes from this last week that should go into 5.11:
      
         - Add missing NOWAIT debugfs definition (Andres)
      
         - Fix kerneldoc warning introduced this merge window (Randy)"
      
      * tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
        block: add debugfs stanza for QUEUE_FLAG_NOWAIT
        fs: block_dev.c: fix kernel-doc warnings from struct block_device changes
      8b4805c6
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block · dc3e24b2
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "A few fixes that should go into 5.11, all marked for stable as well:
      
         - Fix issue around identity COW'ing and users that share a ring
           across processes
      
         - Fix a hang associated with unregistering fixed files (Pavel)
      
         - Move the 'process is exiting' cancelation a bit earlier, so
           task_works aren't affected by it (Pavel)"
      
      * tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
        kernel/io_uring: cancel io_uring before task works
        io_uring: fix io_sqe_files_unregister() hangs
        io_uring: add a helper for setting a ref node
        io_uring: don't assume mm is constant across submits
      dc3e24b2
    • Linus Torvalds's avatar
      depmod: handle the case of /sbin/depmod without /sbin in PATH · cedd1862
      Linus Torvalds authored
      Commit 436e980e
      
       ("kbuild: don't hardcode depmod path") stopped
      hard-coding the path of depmod, but in the process caused trouble for
      distributions that had that /sbin location, but didn't have it in the
      PATH (generally because /sbin is limited to the super-user path).
      
      Work around it for now by just adding /sbin to the end of PATH in the
      depmod.sh script.
      
      Reported-and-tested-by: default avatarSedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cedd1862
  10. Dec 31, 2020
  11. Dec 30, 2020