Commit 2504ba8b authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull power management updates from Rafael Wysocki:
 "These add EPP support to the AMD P-state cpufreq driver, add support
  for new platforms to the Intel RAPL power capping driver, intel_idle
  and the Qualcomm cpufreq driver, enable thermal cooling for Tegra194,
  drop the custom cpufreq driver for loongson1 that is not necessary any
  more (and the corresponding cpufreq platform device), fix assorted
  issues and clean up code.

  Specifics:

   - Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
     Karny, Arnd Bergmann, Bagas Sanjaya)

   - Drop the custom cpufreq driver for loongson1 that is not necessary
     any more and the corresponding cpufreq platform device (Keguang
     Zhang)

   - Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
     entries (Paul E. McKenney)

   - Enable thermal cooling for Tegra194 (Yi-Wei Wang)

   - Register module device table and add missing compatibles for
     cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss)

   - Various dt binding updates for qcom-cpufreq-nvmem and
     opp-v2-kryo-cpu (Christian Marangi)

   - Make kobj_type structure in the cpufreq core constant (Thomas
     Weißschuh)

   - Make cpufreq_unregister_driver() return void (Uwe Kleine-König)

   - Make the TEO cpuidle governor check CPU utilization in order to
     refine idle state selection (Kajetan Puchalski)

   - Make Kconfig select the haltpoll cpuidle governor when the haltpoll
     cpuidle driver is selected and replace a default_idle() call in
     that driver with arch_cpu_idle() to allow MWAIT to be used (Li
     RongQing)

   - Add Emerald Rapids Xeon support to the intel_idle driver (Artem
     Bityutskiy)

   - Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
     avoid randconfig build failures (Arnd Bergmann)

   - Make kobj_type structures used in the cpuidle sysfs interface
     constant (Thomas Weißschuh)

   - Make the cpuidle driver registration code update microsecond values
     of idle state parameters in accordance with their nanosecond values
     if they are provided (Rafael Wysocki)

   - Make the PSCI cpuidle driver prevent topology CPUs from being
     suspended on PREEMPT_RT (Krzysztof Kozlowski)

   - Document that pm_runtime_force_suspend() cannot be used with
     DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald)

   - Add EXPORT macros for exporting PM functions from drivers (Richard
     Fitzgerald)

   - Remove /** from non-kernel-doc comments in hibernation code (Randy
     Dunlap)

   - Fix possible name leak in powercap_register_zone() (Yang Yingliang)

   - Add Meteor Lake and Emerald Rapids support to the intel_rapl power
     capping driver (Zhang Rui)

   - Modify the idle_inject power capping facility to support 100% idle
     injection (Srinivas Pandruvada)

   - Fix large time windows handling in the intel_rapl power capping
     driver (Zhang Rui)

   - Fix memory leaks with using debugfs_lookup() in the generic PM
     domains and Energy Model code (Greg Kroah-Hartman)

   - Add missing 'cache-unified' property in the example for kryo OPP
     bindings (Rob Herring)

   - Fix error checking in opp_migrate_dentry() (Qi Zheng)

   - Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
     Dybcio)

   - Modify some power management utilities to use the canonical ftrace
     path (Ross Zwisler)

   - Correct spelling problems for Documentation/power/ as reported by
     codespell (Randy Dunlap)"

* tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
  Documentation: amd-pstate: disambiguate user space sections
  cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
  dt-bindings: opp: opp-v2-kryo-cpu: enlarge opp-supported-hw maximum
  dt-bindings: cpufreq: qcom-cpufreq-nvmem: make cpr bindings optional
  dt-bindings: cpufreq: qcom-cpufreq-nvmem: specify supported opp tables
  PM: Add EXPORT macros for exporting PM functions
  cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
  MIPS: loongson32: Drop obsolete cpufreq platform device
  powercap: intel_rapl: Fix handling for large time window
  cpuidle: driver: Update microsecond values of state parameters as needed
  cpuidle: sysfs: make kobj_type structures constant
  cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
  PM: EM: fix memory leak with using debugfs_lookup()
  PM: domains: fix memory leak with using debugfs_lookup()
  cpufreq: Make kobj_type structure constant
  cpufreq: davinci: Fix clk use after free
  cpufreq: amd-pstate: avoid uninitialized variable use
  cpufreq: Make cpufreq_unregister_driver() return void
  OPP: fix error checking in opp_migrate_dentry()
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM8550 compatible
  ...
parents 4a7d37e8 dd855f01
Loading
Loading
Loading
Loading
+7 −0
Original line number Diff line number Diff line
@@ -7041,3 +7041,10 @@
			  management firmware translates the requests into actual
			  hardware states (core frequency, data fabric and memory
			  clocks etc.)
			active
			  Use amd_pstate_epp driver instance as the scaling driver,
			  driver provides a hint to the hardware if software wants
			  to bias toward performance (0x0) or energy efficiency (0xff)
			  to the CPPC firmware. then CPPC power algorithm will
			  calculate the runtime workload and adjust the realtime cores
			  frequency.
+74 −4
Original line number Diff line number Diff line
@@ -230,8 +230,8 @@ with :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, it will respond
to the request from AMD P-States.


User Space Interface in ``sysfs``
==================================
User Space Interface in ``sysfs`` - Per-policy control
======================================================

``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
control its functionality at the system level. They are located in the
@@ -262,6 +262,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
<perf_cap_>`_.)
This attribute is read-only.

``energy_performance_available_preferences``

A list of all the supported EPP preferences that could be used for
``energy_performance_preference`` on this system.
These profiles represent different hints that are provided
to the low-level firmware about the user's desired energy vs efficiency
tradeoff.  ``default`` represents the epp value is set by platform
firmware. This attribute is read-only.

``energy_performance_preference``

The current energy performance preference can be read from this attribute.
and user can change current preference according to energy or performance needs
Please get all support profiles list from
``energy_performance_available_preferences`` attribute, all the profiles are
integer values defined between 0 to 255 when EPP feature is enabled by platform
firmware, if EPP feature is disabled, driver will ignore the written value
This attribute is read-write.

Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.

@@ -280,8 +299,30 @@ module which supports the new AMD P-States mechanism on most of the future AMD
platforms. The AMD P-States mechanism is the more performance and energy
efficiency frequency management method on AMD processors.

Kernel Module Options for ``amd-pstate``
=========================================

AMD Pstate Driver Operation Modes
=================================

``amd_pstate`` CPPC has two operation modes: CPPC Autonomous(active) mode and
CPPC non-autonomous(passive) mode.
active mode and passive mode can be chosen by different kernel parameters.
When in Autonomous mode, CPPC ignores requests done in the Desired Performance
Target register and takes into account only the values set to the Minimum requested
performance, Maximum requested performance, and Energy Performance Preference
registers. When Autonomous is disabled, it only considers the Desired Performance Target.

Active Mode
------------

``amd_pstate=active``

This is the low-level firmware control mode which is implemented by ``amd_pstate_epp``
driver with ``amd_pstate=active`` passed to the kernel in the command line.
In this mode, ``amd_pstate_epp`` driver provides a hint to the hardware if software
wants to bias toward performance (0x0) or energy efficiency (0xff) to the CPPC firmware.
then CPPC power algorithm will calculate the runtime workload and adjust the realtime
cores frequency according to the power supply and thermal, core voltage and some other
hardware conditions.

Passive Mode
------------
@@ -298,6 +339,35 @@ processor must provide at least nominal performance requested and go higher if c
operating conditions allow.


User Space Interface in ``sysfs`` - General
===========================================

Global Attributes
-----------------

``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
control its functionality at the system level.  They are located in the
``/sys/devices/system/cpu/amd-pstate/`` directory and affect all CPUs.

``status``
	Operation mode of the driver: "active", "passive" or "disable".

	"active"
		The driver is functional and in the ``active mode``

	"passive"
		The driver is functional and in the ``passive mode``

	"disable"
		The driver is unregistered and not functional now.

        This attribute can be written to in order to change the driver's
        operation mode or to unregister it.  The string written to it must be
        one of the possible values of it and, if successful, writing one of
        these values to the sysfs file will cause the driver to switch over
        to the operation mode represented by that string - or to be
        unregistered in the "disable" case.

``cpupower`` tool support for ``amd-pstate``
===============================================

+5 −0
Original line number Diff line number Diff line
@@ -26,8 +26,13 @@ properties:
        items:
          - enum:
              - qcom,qdu1000-cpufreq-epss
              - qcom,sc7280-cpufreq-epss
              - qcom,sc8280xp-cpufreq-epss
              - qcom,sm6375-cpufreq-epss
              - qcom,sm8250-cpufreq-epss
              - qcom,sm8350-cpufreq-epss
              - qcom,sm8450-cpufreq-epss
              - qcom,sm8550-cpufreq-epss
          - const: qcom,cpufreq-epss

  reg:
+56 −25
Original line number Diff line number Diff line
@@ -17,6 +17,9 @@ description: |
  on the CPU OPP in use. The CPUFreq driver sets the CPR power domain level
  according to the required OPPs defined in the CPU OPP tables.

  For old implementation efuses are parsed to select the correct opp table and
  voltage and CPR is not supported/used.

select:
  properties:
    compatible:
@@ -33,6 +36,34 @@ select:
  required:
    - compatible

patternProperties:
  '^opp-table(-[a-z0-9]+)?$':
    allOf:
      - if:
          properties:
            compatible:
              const: operating-points-v2-kryo-cpu
        then:
          $ref: /schemas/opp/opp-v2-kryo-cpu.yaml#

      - if:
          properties:
            compatible:
              const: operating-points-v2-qcom-level
        then:
          $ref: /schemas/opp/opp-v2-qcom-level.yaml#

    unevaluatedProperties: false

allOf:
  - if:
      properties:
        compatible:
          contains:
            enum:
              - qcom,qcs404

    then:
      properties:
        cpus:
          type: object
+15 −3
Original line number Diff line number Diff line
@@ -50,12 +50,22 @@ patternProperties:
      opp-supported-hw:
        description: |
          A single 32 bit bitmap value, representing compatible HW.
          Bitmap:
          Bitmap for MSM8996 format:
          0:  MSM8996, speedbin 0
          1:  MSM8996, speedbin 1
          2:  MSM8996, speedbin 2
          3-31:  unused
        maximum: 0x7
          3:  MSM8996, speedbin 3
          4-31:  unused

          Bitmap for MSM8996SG format (speedbin shifted of 4 left):
          0-3:  unused
          4:  MSM8996SG, speedbin 0
          5:  MSM8996SG, speedbin 1
          6:  MSM8996SG, speedbin 2
          7-31:  unused
        enum: [0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
               0x9, 0xd, 0xe, 0xf,
               0x10, 0x20, 0x30, 0x70]

      clock-latency-ns: true

@@ -106,6 +116,7 @@ examples:
                L2_0: l2-cache {
                    compatible = "cache";
                    cache-level = <2>;
                    cache-unified;
                };
            };

@@ -140,6 +151,7 @@ examples:
                L2_1: l2-cache {
                    compatible = "cache";
                    cache-level = <2>;
                    cache-unified;
                };
            };

Loading