Commit 92783a90 authored by Linus Torvalds's avatar Linus Torvalds
Browse files
Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Fix the PMCR_EL0 reset value after the PMU rework

   - Correctly handle S2 fault triggered by a S1 page table walk by not
     always classifying it as a write, as this breaks on R/O memslots

   - Document why we cannot exit with KVM_EXIT_MMIO when taking a write
     fault from a S1 PTW on a R/O memslot

   - Put the Apple M2 on the naughty list for not being able to
     correctly implement the vgic SEIS feature, just like the M1 before
     it

   - Reviewer updates: Alex is stepping down, replaced by Zenghui

  x86:

   - Fix various rare locking issues in Xen emulation and teach lockdep
     to detect them

   - Documentation improvements

   - Do not return host topology information from KVM_GET_SUPPORTED_CPUID"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/xen: Avoid deadlock by adding kvm->arch.xen.xen_lock leaf node lock
  KVM: Ensure lockdep knows about kvm->lock vs. vcpu->mutex ordering rule
  KVM: x86/xen: Fix potential deadlock in kvm_xen_update_runstate_guest()
  KVM: x86/xen: Fix lockdep warning on "recursive" gpc locking
  Documentation: kvm: fix SRCU locking order docs
  KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID
  KVM: nSVM: clarify recalc_intercepts() wrt CR8
  MAINTAINERS: Remove myself as a KVM/arm64 reviewer
  MAINTAINERS: Add Zenghui Yu as a KVM/arm64 reviewer
  KVM: arm64: vgic: Add Apple M2 cpus to the list of broken SEIS implementations
  KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_*
  KVM: arm64: Document the behaviour of S1PTW faults on RO memslots
  KVM: arm64: Fix S1PTW handling on RO memslots
  KVM: arm64: PMU: Fix PMCR_EL0 reset value
parents f5fe24ef 310bc395
Loading
Loading
Loading
Loading
+22 −0
Original line number Diff line number Diff line
@@ -1354,6 +1354,14 @@ the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately.  Another
example is madvise(MADV_DROP).

Note: On arm64, a write generated by the page-table walker (to update
the Access and Dirty flags, for example) never results in a
KVM_EXIT_MMIO exit when the slot has the KVM_MEM_READONLY flag. This
is because KVM cannot provide the data that would be written by the
page-table walker, making it impossible to emulate the access.
Instead, an abort (data abort if the cause of the page-table update
was a load or a store, instruction abort if it was an instruction
fetch) is injected in the guest.

4.36 KVM_SET_TSS_ADDR
---------------------
@@ -8310,6 +8318,20 @@ CPU[EAX=1]:ECX[24] (TSC_DEADLINE) is not reported by ``KVM_GET_SUPPORTED_CPUID``
It can be enabled if ``KVM_CAP_TSC_DEADLINE_TIMER`` is present and the kernel
has enabled in-kernel emulation of the local APIC.

CPU topology
~~~~~~~~~~~~

Several CPUID values include topology information for the host CPU:
0x0b and 0x1f for Intel systems, 0x8000001e for AMD systems.  Different
versions of KVM return different values for this information and userspace
should not rely on it.  Currently they return all zeroes.

If userspace wishes to set up a guest topology, it should be careful that
the values of these three leaves differ for each CPU.  In particular,
the APIC ID is found in EDX for all subleaves of 0x0b and 0x1f, and in EAX
for 0x8000001e; the latter also encodes the core id and node id in bits
7:0 of EBX and ECX respectively.

Obsolete ioctls and capabilities
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+13 −12
Original line number Diff line number Diff line
@@ -24,21 +24,22 @@ The acquisition orders for mutexes are as follows:

For SRCU:

- ``synchronize_srcu(&kvm->srcu)`` is called _inside_
  the kvm->slots_lock critical section, therefore kvm->slots_lock
  cannot be taken inside a kvm->srcu read-side critical section.
  Instead, kvm->slots_arch_lock is released before the call
  to ``synchronize_srcu()`` and _can_ be taken inside a
  kvm->srcu read-side critical section.

- kvm->lock is taken inside kvm->srcu, therefore
  ``synchronize_srcu(&kvm->srcu)`` cannot be called inside
  a kvm->lock critical section.  If you cannot delay the
  call until after kvm->lock is released, use ``call_srcu``.
- ``synchronize_srcu(&kvm->srcu)`` is called inside critical sections
  for kvm->lock, vcpu->mutex and kvm->slots_lock.  These locks _cannot_
  be taken inside a kvm->srcu read-side critical section; that is, the
  following is broken::

      srcu_read_lock(&kvm->srcu);
      mutex_lock(&kvm->slots_lock);

- kvm->slots_arch_lock instead is released before the call to
  ``synchronize_srcu()``.  It _can_ therefore be taken inside a
  kvm->srcu read-side critical section, for example while processing
  a vmexit.

On x86:

- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock and kvm->arch.xen.xen_lock

- kvm->arch.mmu_lock is an rwlock.  kvm->arch.tdp_mmu_pages_lock and
  kvm->arch.mmu_unsync_pages_lock are taken inside kvm->arch.mmu_lock, and
+1 −1
Original line number Diff line number Diff line
@@ -11356,9 +11356,9 @@ F: virt/kvm/*
KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)
M:	Marc Zyngier <maz@kernel.org>
R:	James Morse <james.morse@arm.com>
R:	Alexandru Elisei <alexandru.elisei@arm.com>
R:	Suzuki K Poulose <suzuki.poulose@arm.com>
R:	Oliver Upton <oliver.upton@linux.dev>
R:	Zenghui Yu <yuzenghui@huawei.com>
L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L:	kvmarm@lists.linux.dev
L:	kvmarm@lists.cs.columbia.edu (deprecated, moderated for non-subscribers)
+4 −0
Original line number Diff line number Diff line
@@ -124,6 +124,8 @@
#define APPLE_CPU_PART_M1_FIRESTORM_PRO	0x025
#define APPLE_CPU_PART_M1_ICESTORM_MAX	0x028
#define APPLE_CPU_PART_M1_FIRESTORM_MAX	0x029
#define APPLE_CPU_PART_M2_BLIZZARD	0x032
#define APPLE_CPU_PART_M2_AVALANCHE	0x033

#define AMPERE_CPU_PART_AMPERE1		0xAC3

@@ -177,6 +179,8 @@
#define MIDR_APPLE_M1_FIRESTORM_PRO MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM_PRO)
#define MIDR_APPLE_M1_ICESTORM_MAX MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM_MAX)
#define MIDR_APPLE_M1_FIRESTORM_MAX MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM_MAX)
#define MIDR_APPLE_M2_BLIZZARD MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M2_BLIZZARD)
#define MIDR_APPLE_M2_AVALANCHE MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M2_AVALANCHE)
#define MIDR_AMPERE1 MIDR_CPU_MODEL(ARM_CPU_IMP_AMPERE, AMPERE_CPU_PART_AMPERE1)

/* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */
+9 −0
Original line number Diff line number Diff line
@@ -114,6 +114,15 @@
#define ESR_ELx_FSC_ACCESS	(0x08)
#define ESR_ELx_FSC_FAULT	(0x04)
#define ESR_ELx_FSC_PERM	(0x0C)
#define ESR_ELx_FSC_SEA_TTW0	(0x14)
#define ESR_ELx_FSC_SEA_TTW1	(0x15)
#define ESR_ELx_FSC_SEA_TTW2	(0x16)
#define ESR_ELx_FSC_SEA_TTW3	(0x17)
#define ESR_ELx_FSC_SECC	(0x18)
#define ESR_ELx_FSC_SECC_TTW0	(0x1c)
#define ESR_ELx_FSC_SECC_TTW1	(0x1d)
#define ESR_ELx_FSC_SECC_TTW2	(0x1e)
#define ESR_ELx_FSC_SECC_TTW3	(0x1f)

/* ISS field definitions for Data Aborts */
#define ESR_ELx_ISV_SHIFT	(24)
Loading