Commit 7dfcf66a authored by openeuler-ci-bot, committed by Gitee

!2481 Introduce PBHA and PBHA bit0 to control the usage of HBM Cache precisely

Merge Pull Request from: @ci-robot 
 
PR sync from: Wupeng Ma <mawupeng1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/XTEXXS6LTDPZLZUMINEOZIKT6R3XSQHX/ 
From: Ma Wupeng <mawupeng1@huawei.com>

Patch 1: move FDT init out of kaslr_early_init for future use.
Patch 2 to 8: enable feature PBHA for arm64.
Patch 9-18: Control the usage of HBM cache for kernel and task precisely.
Patch 19: Enable feature PBHA for arm64 by default.

Changelog since v5:
- Return -EINVAL if pbha bit0 is not enabled in pbha_bit0_update_vma in
  patch #15

Changelog since v4:
- update pbha_bit0_update_pgprot to pgprot_pbha_bit0

Changelog since v3:
- update desc for patch #15

Changelog since v2:
- correct the erroneous documentation in patch #18

Changelog since v1:
- fix kABI breakage caused by include files

James Morse (7):
  KVM: arm64: Detect and enable PBHA for stage2
  dt-bindings: Rename the description of cpu nodes cpu.yaml
  dt-bindings: arm: Add binding for Page Based Hardware Attributes
  arm64: cpufeature: Enable PBHA bits for stage1
  arm64: mm: Add pgprot_pbha() to allow drivers to request PBHA values
  KVM: arm64: Configure PBHA bits for stage2
  Documentation: arm64: Describe the support and expectations for PBHA

Ma Wupeng (11):
  arm64: cpufeature: Enable PBHA for stage1 early via FDT
  arm64: mm: Detect and enable PBHA bit0 at early startup
  arm64: mm: Update kernel pte entries if pbha bit0 enabled
  arm64: mm: Show PBHA bit 59 as PBHA0 in ptdump
  arm64: mm: Introduce VM_PBHA_BIT0 to enable pbha bit0 for single vma
  arm64: mm: Set PBHA0 bit for VM_PBHA_BIT0
  arm64: mm: Introduce procfs interface to update PBHA0 bit
  arm64: mm: Set flag VM_PBHA_BIT0 for global init task
  arm64: mm: Introduce prctl to control pbha behavior
  arm64: mm: Introduce kernel param pbha
  openeuler: configs: arm64: Enable PBHA by default

Marc Zyngier (1):
  arm64: Extract early FDT mapping from kaslr_early_init()


-- 
2.25.1
 
https://gitee.com/openeuler/kernel/issues/I7ZC0H 
 
Link: https://gitee.com/openeuler/kernel/pulls/2481

 

Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Zucheng Zheng <zhengzucheng@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
parents 1422b1b2 35ed7d91
+8 −0
@@ -6457,3 +6457,11 @@
				memory, and other data can't be written using
				xmon commands.
			off	xmon is disabled.

	pbha=		[ARM64]
			Format: { enable | user }
			Enable PBHA bit0.
			enable	the kernel and user tasks update PBHA bit0 in
				their pte entries.
			user	only selected user tasks update PBHA bit0 in
				their pte entries.
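
As an illustration of the two documented values, the sketch below maps a `pbha=` argument onto a mode. It is a userspace model only: the enum, the function name, and the choice to treat an unknown value as "off" are assumptions, since the parser added by the patch is not shown in this diff.

```c
#include <string.h>

/* Hypothetical names; they are not taken from the patch. */
enum pbha_mode {
	PBHA_MODE_OFF,		/* parameter absent or not recognised */
	PBHA_MODE_USER,		/* pbha=user */
	PBHA_MODE_ENABLE	/* pbha=enable */
};

/* Map the string following "pbha=" onto one of the documented modes. */
static enum pbha_mode pbha_parse_mode(const char *arg)
{
	if (arg == NULL)
		return PBHA_MODE_OFF;
	if (strcmp(arg, "enable") == 0)
		return PBHA_MODE_ENABLE;
	if (strcmp(arg, "user") == 0)
		return PBHA_MODE_USER;
	/* Assumption: an unknown value leaves PBHA bit0 disabled. */
	return PBHA_MODE_OFF;
}
```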
+1 −0
@@ -17,6 +17,7 @@ ARM64 Architecture
    legacy_instructions
    memory
    memory-tagging-extension
    pbha
    perf
    pointer-authentication
    silicon-errata
+85 −0
=======================================================
Page Based Hardware Attribute support for AArch64 Linux
=======================================================

Page Based Hardware Attributes (PBHA) allow the OS to trigger IMPLEMENTATION
DEFINED behaviour associated with a memory access. For example, this may be
taken as a hint to a System Cache whether it should cache the location that
has been accessed.

PBHA consists of four bits in the leaf page table entries for a virtual
address; these bits are sent with any memory access made via that virtual
address.
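
A minimal userspace sketch of how a four-bit field can be packed into a 64-bit leaf entry follows. It assumes the field occupies bits 62:59, which matches the ptdump patch in this series reporting bit 59 as PBHA0. The macro and function names are illustrative and are not the kernel's.

```c
#include <stdint.h>

/* Assumed layout: PBHA occupies bits 62:59 of a leaf descriptor. */
#define PBHA_SHIFT	59
#define PBHA_MASK	(UINT64_C(0xf) << PBHA_SHIFT)

/* Replace the PBHA field of a descriptor, leaving all other bits alone. */
static uint64_t pte_set_pbha(uint64_t pte, unsigned int val)
{
	return (pte & ~PBHA_MASK) | (((uint64_t)val & 0xf) << PBHA_SHIFT);
}

/* Extract the four-bit PBHA value from a descriptor. */
static unsigned int pte_get_pbha(uint64_t pte)
{
	return (unsigned int)((pte & PBHA_MASK) >> PBHA_SHIFT);
}
```

With this layout, a PBHA value of 1 sets only bit 59, the bit the series calls PBHA bit0.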

IMPLEMENTATION DEFINED behaviour is not specified by the Arm Architecture
Reference Manual (Arm ARM), meaning it varies between SoCs. There may be
unexpected side effects when PBHA bits are used or combined.
For example, a PBHA bit may be taken as a hint to the Memory Controller that
it should encrypt/decrypt the memory in DRAM. If the CPU has multiple virtual
aliases of the address, accesses that are made without this PBHA bit set may
cause corruption.


Use by virtual machines using KVM
---------------------------------

KVM allows an OS in a virtual machine to configure its own page tables. A
virtual machine can also configure PBHA bits in its page tables. To prevent
side effects that could affect the hypervisor, KVM will only allow
combinations of PBHA bits that only affect performance. Values that cause
changes to the data are forbidden as the Hypervisor and VMM have aliases of
the guest memory, and may swap it to/from disk.

The list of bits to allow is built from the firmware list of PBHA bit
combinations that only affect performance. Because the guest can choose
not to set all the bits in a value (e.g. allowing 5 implicitly allows 1
and 4), the values supported may differ between a host and guest.
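
The "allowing 5 implicitly allows 1 and 4" point can be illustrated by enumerating every subset of the bits set in each allowed value. The sketch below is a userspace illustration of that arithmetic only. The function name and the result encoding (bit n set means value n is reachable) are assumptions, and it is not presented as KVM's actual algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: returns a bitmap in which bit n is set when the 4-bit
 * PBHA value n can be produced by a guest that is allowed the listed values
 * and may clear any of their bits.
 */
static uint16_t pbha_reachable_values(const uint8_t *allowed, size_t count)
{
	uint16_t reachable = 0;

	for (size_t i = 0; i < count; i++) {
		uint8_t value = allowed[i] & 0xf;
		uint8_t sub = value;

		/* Walk every subset of the bits set in 'value', ending at 0. */
		for (;;) {
			reachable |= (uint16_t)(1u << sub);
			if (sub == 0)
				break;
			sub = (uint8_t)((sub - 1) & value);
		}
	}
	return reachable;
}
```

For an allowed list containing only 5 (binary 0101), the reachable values are 5, 4, 1 and 0, so the returned bitmap is 0x33.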

PBHA is only supported for a guest if KVM supports the mechanism the CPU uses
to combine the values from stage1 and stage2 translation. The mechanism is not
advertised, so which mechanism each CPU uses must also be known by the kernel.


Use by device drivers
---------------------

Device drivers should discover the PBHA value to use for a mapping from the
device's firmware description as these will vary between SoCs. If the value
is also listed by firmware as only affecting performance, it can be added to
the pgprot with pgprot_pbha().

Values that require all other aliases to be removed are not supported.


Linux's expectations around PBHA
--------------------------------

'IMPLEMENTATION DEFINED' describes a huge range of possible behaviours.
Linux expects PBHA to behave in the same way as the read/write allocate hints
for a memory type. Below is an incomplete list of expectations:

 * PBHA values have the same meaning for all CPUs in the SoC.
 * Use of the PBHA value does not cause mismatched type, shareability or
   cacheability; it does not take precedence over the stage2 attributes or
   HCR_EL2 controls.
 * If a PBHA value requires all other aliases to be removed, higher exception
   levels do not have a concurrent alias. (This includes Secure World).
 * Break before make is sufficient when changing the PBHA value.
 * PBHA values used by a page can be changed independently without further side
   effects.
 * Saving/restoring the page contents via a PBHA=0 mapping does not corrupt the
   values once a non-zero PBHA mapping is re-created.
 * The hypervisor may clean+invalidate to the PoC via a PBHA=0 mapping prior to
   save/restore to clean up mismatched attributes. This does not corrupt the
   values after save/restore once a non-zero PBHA mapping is re-created.
 * Cache maintenance via a PBHA=0 mapping to prevent stale data being visible
   when mismatched attributes occur is sufficient even if the subsequent
   mapping has a non-zero PBHA value.
 * The OS/hypervisor can clean up a page by removing all non-zero PBHA mappings,
   then writing new data via a PBHA=0 mapping of the same type, shareability and
   cacheability. After this, only the new data is visible for data accesses.
 * For instruction-fetch, the same maintenance as would be performed against a
   PBHA=0 page is sufficient (which, with DIC+IDC, may be none at all).
 * The behaviour enabled by PBHA should not depend on the size of the access, or
   whether other SoC hardware under the control of the OS is enabled and
   configured.
 * EL2 is able to at least force stage1 PBHA bits to zero.
+537 −0
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/arm/cpu.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#

title: ARM CPUs bindings

maintainers:
  - Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

description: |+
  The device tree allows the layout of CPUs in a system to be described
  through the "cpus" node, which in turn contains a number of subnodes
  (i.e. "cpu") defining properties for every cpu.

  Bindings for CPU nodes follow the Devicetree Specification, available from:

  https://www.devicetree.org/specifications/

  with updates for 32-bit and 64-bit ARM systems provided in this document.

  ================================
  Convention used in this document
  ================================

  This document follows the conventions described in the Devicetree
  Specification, with the addition:

  - square brackets define bitfields, e.g. reg[7:0] is the value of the
    bitfield in the reg property contained in bits 7 down to 0

  =====================================
  cpus and cpu node bindings definition
  =====================================

  The ARM architecture, in accordance with the Devicetree Specification,
  requires the cpus and cpu nodes to be present and contain the properties
  described below.

properties:
  reg:
    maxItems: 1
    description: |
      Usage and definition depend on ARM architecture version and
      configuration:

      On uniprocessor ARM architectures previous to v7
      this property is required and must be set to 0.

      On ARM 11 MPcore based systems this property is
        required and matches the CPUID[11:0] register bits.

        Bits [11:0] in the reg cell must be set to
        bits [11:0] in CPU ID register.

        All other bits in the reg cell must be set to 0.

      On 32-bit ARM v7 or later systems this property is
        required and matches the CPU MPIDR[23:0] register
        bits.

        Bits [23:0] in the reg cell must be set to
        bits [23:0] in MPIDR.

        All other bits in the reg cell must be set to 0.

      On ARM v8 64-bit systems this property is required
        and matches the MPIDR_EL1 register affinity bits.

        * If cpus node's #address-cells property is set to 2

          The first reg cell bits [7:0] must be set to
          bits [39:32] of MPIDR_EL1.

          The second reg cell bits [23:0] must be set to
          bits [23:0] of MPIDR_EL1.

        * If cpus node's #address-cells property is set to 1

          The reg cell bits [23:0] must be set to bits [23:0]
          of MPIDR_EL1.

      All other bits in the reg cells must be set to 0.

  compatible:
    enum:
      - arm,arm710t
      - arm,arm720t
      - arm,arm740t
      - arm,arm7ej-s
      - arm,arm7tdmi
      - arm,arm7tdmi-s
      - arm,arm9es
      - arm,arm9ej-s
      - arm,arm920t
      - arm,arm922t
      - arm,arm925
      - arm,arm926e-s
      - arm,arm926ej-s
      - arm,arm940t
      - arm,arm946e-s
      - arm,arm966e-s
      - arm,arm968e-s
      - arm,arm9tdmi
      - arm,arm1020e
      - arm,arm1020t
      - arm,arm1022e
      - arm,arm1026ej-s
      - arm,arm1136j-s
      - arm,arm1136jf-s
      - arm,arm1156t2-s
      - arm,arm1156t2f-s
      - arm,arm1176jzf
      - arm,arm1176jz-s
      - arm,arm1176jzf-s
      - arm,arm11mpcore
      - arm,armv8 # Only for s/w models
      - arm,cortex-a5
      - arm,cortex-a7
      - arm,cortex-a8
      - arm,cortex-a9
      - arm,cortex-a12
      - arm,cortex-a15
      - arm,cortex-a17
      - arm,cortex-a32
      - arm,cortex-a34
      - arm,cortex-a35
      - arm,cortex-a53
      - arm,cortex-a55
      - arm,cortex-a57
      - arm,cortex-a65
      - arm,cortex-a72
      - arm,cortex-a73
      - arm,cortex-a75
      - arm,cortex-a76
      - arm,cortex-a77
      - arm,cortex-m0
      - arm,cortex-m0+
      - arm,cortex-m1
      - arm,cortex-m3
      - arm,cortex-m4
      - arm,cortex-r4
      - arm,cortex-r5
      - arm,cortex-r7
      - arm,neoverse-e1
      - arm,neoverse-n1
      - brcm,brahma-b15
      - brcm,brahma-b53
      - brcm,vulcan
      - cavium,thunder
      - cavium,thunder2
      - faraday,fa526
      - intel,sa110
      - intel,sa1100
      - marvell,feroceon
      - marvell,mohawk
      - marvell,pj4a
      - marvell,pj4b
      - marvell,sheeva-v5
      - marvell,sheeva-v7
      - nvidia,tegra132-denver
      - nvidia,tegra186-denver
      - nvidia,tegra194-carmel
      - qcom,krait
      - qcom,kryo
      - qcom,kryo260
      - qcom,kryo280
      - qcom,kryo385
      - qcom,kryo468
      - qcom,kryo485
      - qcom,scorpion

  enable-method:
    $ref: '/schemas/types.yaml#/definitions/string'
    oneOf:
      # On ARM v8 64-bit this property is required
      - enum:
          - psci
          - spin-table
      # On ARM 32-bit systems this property is optional
      - enum:
          - actions,s500-smp
          - allwinner,sun6i-a31
          - allwinner,sun8i-a23
          - allwinner,sun9i-a80-smp
          - allwinner,sun8i-a83t-smp
          - amlogic,meson8-smp
          - amlogic,meson8b-smp
          - arm,realview-smp
          - aspeed,ast2600-smp
          - brcm,bcm11351-cpu-method
          - brcm,bcm23550
          - brcm,bcm2836-smp
          - brcm,bcm63138
          - brcm,bcm-nsp-smp
          - brcm,brahma-b15
          - marvell,armada-375-smp
          - marvell,armada-380-smp
          - marvell,armada-390-smp
          - marvell,armada-xp-smp
          - marvell,98dx3236-smp
          - marvell,mmp3-smp
          - mediatek,mt6589-smp
          - mediatek,mt81xx-tz-smp
          - qcom,gcc-msm8660
          - qcom,kpss-acc-v1
          - qcom,kpss-acc-v2
          - renesas,apmu
          - renesas,r9a06g032-smp
          - rockchip,rk3036-smp
          - rockchip,rk3066-smp
          - socionext,milbeaut-m10v-smp
          - ste,dbx500-smp
          - ti,am3352
          - ti,am4372

  cpu-release-addr:
    $ref: '/schemas/types.yaml#/definitions/uint64'

    description:
      Required for systems that have an "enable-method"
        property value of "spin-table".
      On ARM v8 64-bit systems must be a two cell
        property identifying a 64-bit zero-initialised
        memory location.

  cpu-idle-states:
    $ref: '/schemas/types.yaml#/definitions/phandle-array'
    description: |
      List of phandles to idle state nodes supported
      by this cpu (see ./idle-states.yaml).

  capacity-dmips-mhz:
    $ref: '/schemas/types.yaml#/definitions/uint32'
    description:
      u32 value representing CPU capacity (see ./cpu-capacity.txt) in
      DMIPS/MHz, relative to highest capacity-dmips-mhz
      in the system.

  dynamic-power-coefficient:
    $ref: '/schemas/types.yaml#/definitions/uint32'
    description:
      A u32 value that represents the running time dynamic
      power coefficient in units of uW/MHz/V^2. The
      coefficient can either be calculated from power
      measurements or derived by analysis.

      The dynamic power consumption of the CPU  is
      proportional to the square of the Voltage (V) and
      the clock frequency (f). The coefficient is used to
      calculate the dynamic power as below -

      Pdyn = dynamic-power-coefficient * V^2 * f

      where voltage is in V, frequency is in MHz.

  power-domains:
    $ref: '/schemas/types.yaml#/definitions/phandle-array'
    description:
      List of phandles and PM domain specifiers, as defined by bindings of the
      PM domain provider (see also ../power_domain.txt).

  power-domain-names:
    $ref: '/schemas/types.yaml#/definitions/string-array'
    description:
      A list of power domain name strings sorted in the same order as the
      power-domains property.

      For PSCI based platforms, the name corresponding to the index of the PSCI
      PM domain provider, must be "psci".

  qcom,saw:
    $ref: '/schemas/types.yaml#/definitions/phandle'
    description: |
      Specifies the SAW* node associated with this CPU.

      Required for systems that have an "enable-method" property
      value of "qcom,kpss-acc-v1" or "qcom,kpss-acc-v2"

      * arm/msm/qcom,saw2.txt

  qcom,acc:
    $ref: '/schemas/types.yaml#/definitions/phandle'
    description: |
      Specifies the ACC* node associated with this CPU.

      Required for systems that have an "enable-method" property
      value of "qcom,kpss-acc-v1" or "qcom,kpss-acc-v2"

      * arm/msm/qcom,kpss-acc.txt

  rockchip,pmu:
    $ref: '/schemas/types.yaml#/definitions/phandle'
    description: |
      Specifies the syscon node controlling the cpu core power domains.

      Optional for systems that have an "enable-method"
      property value of "rockchip,rk3066-smp"
      While optional, it is the preferred way to get access to
      the cpu-core power-domains.

  secondary-boot-reg:
    $ref: '/schemas/types.yaml#/definitions/uint32'
    description: |
      Required for systems that have an "enable-method" property value of
      "brcm,bcm11351-cpu-method", "brcm,bcm23550" or "brcm,bcm-nsp-smp".

      This includes the following SoCs:
      BCM11130, BCM11140, BCM11351, BCM28145, BCM28155, BCM21664, BCM23550
      BCM58522, BCM58525, BCM58535, BCM58622, BCM58623, BCM58625, BCM88312

      The secondary-boot-reg property is a u32 value that specifies the
      physical address of the register used to request the ROM holding pen
      code release a secondary CPU. The value written to the register is
      formed by encoding the target CPU id into the low bits of the
      physical start address it should jump to.

if:
  # If the enable-method property contains one of those values
  properties:
    enable-method:
      contains:
        enum:
          - brcm,bcm11351-cpu-method
          - brcm,bcm23550
          - brcm,bcm-nsp-smp
  # and if enable-method is present
  required:
    - enable-method

then:
  required:
    - secondary-boot-reg

required:
  - device_type
  - reg
  - compatible

dependencies:
  rockchip,pmu: [enable-method]

additionalProperties: true

examples:
  - |
    cpus {
      #size-cells = <0>;
      #address-cells = <1>;

      cpu@0 {
        device_type = "cpu";
        compatible = "arm,cortex-a15";
        reg = <0x0>;
      };

      cpu@1 {
        device_type = "cpu";
        compatible = "arm,cortex-a15";
        reg = <0x1>;
      };

      cpu@100 {
        device_type = "cpu";
        compatible = "arm,cortex-a7";
        reg = <0x100>;
      };

      cpu@101 {
        device_type = "cpu";
        compatible = "arm,cortex-a7";
        reg = <0x101>;
      };
    };

  - |
    // Example 2 (Cortex-A8 uniprocessor 32-bit system):
    cpus {
      #size-cells = <0>;
      #address-cells = <1>;

      cpu@0 {
        device_type = "cpu";
        compatible = "arm,cortex-a8";
        reg = <0x0>;
      };
    };

  - |
    // Example 3 (ARM 926EJ-S uniprocessor 32-bit system):
    cpus {
      #size-cells = <0>;
      #address-cells = <1>;

      cpu@0 {
        device_type = "cpu";
        compatible = "arm,arm926ej-s";
        reg = <0x0>;
      };
    };

  - |
    //  Example 4 (ARM Cortex-A57 64-bit system):
    cpus {
      #size-cells = <0>;
      #address-cells = <2>;

      cpu@0 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x0 0x0>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@1 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x0 0x1>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x0 0x100>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@101 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x0 0x101>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@10000 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x0 0x10000>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@10001 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x0 0x10001>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@10100 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x0 0x10100>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@10101 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x0 0x10101>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100000000 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x1 0x0>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100000001 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x1 0x1>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100000100 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x1 0x100>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100000101 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x1 0x101>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100010000 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x1 0x10000>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100010001 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x1 0x10001>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100010100 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x1 0x10100>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };

      cpu@100010101 {
        device_type = "cpu";
        compatible = "arm,cortex-a57";
        reg = <0x1 0x10101>;
        enable-method = "spin-table";
        cpu-release-addr = <0 0x20000000>;
      };
    };
...
+69 −515

File changed.

Preview size limit exceeded, changes collapsed.