Commit a1e1edde authored by Catalin Marinas's avatar Catalin Marinas
Browse files

Merge branches 'for-next/misc', 'for-next/kselftest', 'for-next/xntable',...

Merge branches 'for-next/misc', 'for-next/kselftest', 'for-next/xntable', 'for-next/vdso', 'for-next/fiq', 'for-next/epan', 'for-next/kasan-vmalloc', 'for-next/fgt-boot-init', 'for-next/vhe-only' and 'for-next/neon-softirqs-disabled', remote-tracking branch 'arm64/for-next/perf' into for-next/core

* for-next/misc:
  : Miscellaneous patches
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: mte: Remove unused mte_assign_mem_tag_range()
  arm64: Add __init section marker to some functions
  arm64/sve: Rework SVE access trap to convert state in registers
  docs: arm64: Fix a grammar error
  arm64: smp: Add missing prototype for some smp.c functions
  arm64: setup: name `tcr` register
  arm64: setup: name `mair` register
  arm64: stacktrace: Move start_backtrace() out of the header
  arm64: barrier: Remove spec_bar() macro
  arm64: entry: remove test_irqs_unmasked macro
  ARM64: enable GENERIC_FIND_FIRST_BIT
  arm64: defconfig: Use DEBUG_INFO_REDUCED

* for-next/kselftest:
  : Various kselftests for arm64
  kselftest: arm64: Add BTI tests
  kselftest/arm64: mte: Report filename on failing temp file creation
  kselftest/arm64: mte: Fix clang warning
  kselftest/arm64: mte: Makefile: Fix clang compilation
  kselftest/arm64: mte: Output warning about failing compiler
  kselftest/arm64: mte: Use cross-compiler if specified
  kselftest/arm64: mte: Fix MTE feature detection
  kselftest/arm64: mte: common: Fix write() warnings
  kselftest/arm64: mte: user_mem: Fix write() warning
  kselftest/arm64: mte: ksm_options: Fix fscanf warning
  kselftest/arm64: mte: Fix pthread linking
  kselftest/arm64: mte: Fix compilation with native compiler

* for-next/xntable:
  : Add hierarchical XN permissions for all page tables
  arm64: mm: use XN table mapping attributes for user/kernel mappings
  arm64: mm: use XN table mapping attributes for the linear region
  arm64: mm: add missing P4D definitions and use them consistently

* for-next/vdso:
  : Minor improvements to the compat vdso and sigpage
  arm64: compat: Poison the compat sigpage
  arm64: vdso: Avoid ISB after reading from cntvct_el0
  arm64: compat: Allow signal page to be remapped
  arm64: vdso: Remove redundant calls to flush_dcache_page()
  arm64: vdso: Use GFP_KERNEL for allocating compat vdso and signal pages

* for-next/fiq:
  : Support arm64 FIQ controller registration
  arm64: irq: allow FIQs to be handled
  arm64: Always keep DAIF.[IF] in sync
  arm64: entry: factor irq triage logic into macros
  arm64: irq: rework root IRQ handler registration
  arm64: don't use GENERIC_IRQ_MULTI_HANDLER
  genirq: Allow architectures to override set_handle_irq() fallback

* for-next/epan:
  : Support for Enhanced PAN (execute-only permissions)
  arm64: Support execute-only permissions with Enhanced PAN

* for-next/kasan-vmalloc:
  : Support CONFIG_KASAN_VMALLOC on arm64
  arm64: Kconfig: select KASAN_VMALLOC if KANSAN_GENERIC is enabled
  arm64: kaslr: support randomized module area with KASAN_VMALLOC
  arm64: Kconfig: support CONFIG_KASAN_VMALLOC
  arm64: kasan: abstract _text and _end to KERNEL_START/END
  arm64: kasan: don't populate vmalloc area for CONFIG_KASAN_VMALLOC

* for-next/fgt-boot-init:
  : Booting clarifications and fine grained traps setup
  arm64: Require that system registers at all visible ELs be initialized
  arm64: Disable fine grained traps on boot
  arm64: Document requirements for fine grained traps at boot

* for-next/vhe-only:
  : Dealing with VHE-only CPUs (a.k.a. M1)
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  arm64: cpufeature: Allow early filtering of feature override

* arm64/for-next/perf:
  arm64: perf: Remove redundant initialization in perf_event.c
  perf/arm_pmu_platform: Clean up with dev_printk
  perf/arm_pmu_platform: Fix error handling
  perf/arm_pmu_platform: Use dev_err_probe() for IRQ errors
  docs: perf: Address some html build warnings
  docs: perf: Add new description on HiSilicon uncore PMU v2
  drivers/perf: hisi: Add support for HiSilicon PA PMU driver
  drivers/perf: hisi: Add support for HiSilicon SLLC PMU driver
  drivers/perf: hisi: Update DDRC PMU for programmable counter
  drivers/perf: hisi: Add new functions for HHA PMU
  drivers/perf: hisi: Add new functions for L3C PMU
  drivers/perf: hisi: Add PMU version for uncore PMU drivers.
  drivers/perf: hisi: Refactor code for more uncore PMUs
  drivers/perf: hisi: Remove unnecessary check of counter index
  drivers/perf: Simplify the SMMUv3 PMU event attributes
  drivers/perf: convert sysfs sprintf family to sysfs_emit
  drivers/perf: convert sysfs scnprintf family to sysfs_emit_at() and sysfs_emit()
  drivers/perf: convert sysfs snprintf family to sysfs_emit

* for-next/neon-softirqs-disabled:
  : Run kernel mode SIMD with softirqs disabled
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
Loading
+1 −2
Original line number Diff line number Diff line
@@ -2279,8 +2279,7 @@
				   state is kept private from the host.
				   Not valid if the kernel is running in EL2.

			Defaults to VHE/nVHE based on hardware support and
			the value of CONFIG_ARM64_VHE.
			Defaults to VHE/nVHE based on hardware support.

	kvm-arm.vgic_v3_group0_trap=
			[KVM,ARM] Trap guest accesses to GICv3 group-0
+54 −0
Original line number Diff line number Diff line
@@ -53,6 +53,60 @@ Example usage of perf::
  $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5

For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
as PMU v1, but some new functions are added to the hardware.

(a) L3C PMU supports filtering by core/thread within the cluster which can be
specified as a bitmap::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5

This will only count the operations from core/thread 0 and 1 in this cluster.

(b) Tracetag allow the user to chose to count only read, write or atomic
operations via the tt_req parameeter in perf. The default value counts all
operations. tt_req is 3bits, 3'b100 represents read operations, 3'b101
represents write operations, 3'b110 represents atomic store operations and
3'b111 represents atomic non-store operations, other values are reserved::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5

This will only count the read operations in this cluster.

(c) Datasrc allows the user to check where the data comes from. It is 5 bits.
Some important codes are as follows:
5'b00001: comes from L3C in this die;
5'b01000: comes from L3C in the cross-die;
5'b01001: comes from L3C which is in another socket;
5'b01110: comes from the local DDR;
5'b01111: comes from the cross-die DDR;
5'b10000: comes from cross-socket DDR;
etc, it is mainly helpful to find that the data source is nearest from the CPU
cores. If datasrc_cfg is used in the multi-chips, the datasrc_skt shall be
configured in perf command::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/,
  hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5

(d)Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
clusters (SICL) containing multiple I/O clusters (ICLs). Each CCL/ICL in the
SoC has a unique ID. Each ID is 11bits, include a 6-bit SCCL-ID and 5-bit
CCL/ICL-ID. For I/O die, the ICL-ID is followed by:
5'b00000: I/O_MGMT_ICL;
5'b00001: Network_ICL;
5'b00011: HAC_ICL;
5'b10000: PCIe_ICL;

Users could configure IDs to count data come from specific CCL/ICL, by setting
srcid_cmd & srcid_msk, and data desitined for specific CCL/ICL by setting
tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not
check the bit when matching against the srcid_cmd/tgtid_cmd.

If all of these options are disabled, it can works by the default value that
doesn't distinguish the filter condition and ID information and will return
the total counter values in the PMU counters.

The current driver does not support sampling. So "perf record" is unsupported.
Also attach to a task is unsupported as the events are all uncore.

+10 −3
Original line number Diff line number Diff line
@@ -202,9 +202,10 @@ Before jumping into the kernel, the following conditions must be met:

- System registers

  All writable architected system registers at the exception level where
  the kernel image will be entered must be initialised by software at a
  higher exception level to prevent execution in an UNKNOWN state.
  All writable architected system registers at or below the exception
  level where the kernel image will be entered must be initialised by
  software at a higher exception level to prevent execution in an UNKNOWN
  state.

  - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
    executing on.
@@ -270,6 +271,12 @@ Before jumping into the kernel, the following conditions must be met:
      having 0b1 set for the corresponding bit for each of the auxiliary
      counters present.

  For CPUs with the Fine Grained Traps (FEAT_FGT) extension present:

  - If EL3 is present and the kernel is entered at EL2:

    - SCR_EL3.FGTEn (bit 27) must be initialised to 0b1.

The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs.  All CPUs must
enter the kernel in the same exception level.
+19 −21
Original line number Diff line number Diff line
@@ -111,7 +111,6 @@ config ARM64
	select GENERIC_FIND_FIRST_BIT
	select GENERIC_IDLE_POLL_SETUP
	select GENERIC_IRQ_IPI
	select GENERIC_IRQ_MULTI_HANDLER
	select GENERIC_IRQ_PROBE
	select GENERIC_IRQ_SHOW
	select GENERIC_IRQ_SHOW_LEVEL
@@ -139,6 +138,7 @@ config ARM64
	select HAVE_ARCH_JUMP_LABEL
	select HAVE_ARCH_JUMP_LABEL_RELATIVE
	select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
	select HAVE_ARCH_KASAN_VMALLOC if HAVE_ARCH_KASAN
	select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN
	select HAVE_ARCH_KASAN_HW_TAGS if (HAVE_ARCH_KASAN && ARM64_MTE)
	select HAVE_ARCH_KFENCE
@@ -195,6 +195,7 @@ config ARM64
	select IOMMU_DMA if IOMMU_SUPPORT
	select IRQ_DOMAIN
	select IRQ_FORCED_THREADING
	select KASAN_VMALLOC if KASAN_GENERIC
	select MODULES_USE_ELF_RELA
	select NEED_DMA_MAP_STATE
	select NEED_SG_DMA_LENGTH
@@ -1059,6 +1060,9 @@ config SYS_SUPPORTS_HUGETLBFS
config ARCH_HAS_CACHE_LINE_SIZE
	def_bool y

config ARCH_HAS_FILTER_PGPROT
	def_bool y

config ARCH_ENABLE_SPLIT_PMD_PTLOCK
	def_bool y if PGTABLE_LEVELS > 2

@@ -1417,19 +1421,6 @@ config ARM64_USE_LSE_ATOMICS
	  built with binutils >= 2.25 in order for the new instructions
	  to be used.

config ARM64_VHE
	bool "Enable support for Virtualization Host Extensions (VHE)"
	default y
	help
	  Virtualization Host Extensions (VHE) allow the kernel to run
	  directly at EL2 (instead of EL1) on processors that support
	  it. This leads to better performance for KVM, as they reduce
	  the cost of the world switch.

	  Selecting this option allows the VHE feature to be detected
	  at runtime, and does not affect processors that do not
	  implement this feature.

endmenu

menu "ARMv8.2 architectural features"
@@ -1682,10 +1673,23 @@ config ARM64_MTE

endmenu

menu "ARMv8.7 architectural features"

config ARM64_EPAN
	bool "Enable support for Enhanced Privileged Access Never (EPAN)"
	default y
	depends on ARM64_PAN
	help
	 Enhanced Privileged Access Never (EPAN) allows Privileged
	 Access Never to be used with Execute-only mappings.

	 The feature is detected at runtime, and will remain disabled
	 if the cpu does not implement the feature.
endmenu

config ARM64_SVE
	bool "ARM Scalable Vector Extension support"
	default y
	depends on !KVM || ARM64_VHE
	help
	  The Scalable Vector Extension (SVE) is an extension to the AArch64
	  execution state which complements and extends the SIMD functionality
@@ -1714,12 +1718,6 @@ config ARM64_SVE
	  booting the kernel.  If unsure and you are not observing these
	  symptoms, you should assume that it is safe to say Y.

	  CPUs that support SVE are architecturally required to support the
	  Virtualization Host Extensions (VHE), so the kernel makes no
	  provision for supporting SVE alongside KVM without VHE enabled.
	  Thus, you will need to enable CONFIG_ARM64_VHE if you want to support
	  KVM in the same kernel image.

config ARM64_MODULE_PLTS
	bool "Use PLTs to allow module memory to spill over into vmalloc area"
	depends on MODULES
+1 −1
Original line number Diff line number Diff line
@@ -700,7 +700,7 @@ AES_FUNC_START(aes_mac_update)
	cbz		w5, .Lmacout
	encrypt_block	v0, w2, x1, x7, w8
	st1		{v0.16b}, [x4]			/* return dg */
	cond_yield	.Lmacout, x7
	cond_yield	.Lmacout, x7, x8
	b		.Lmacloop4x
.Lmac1x:
	add		w3, w3, #4
Loading