Commit c99ad25b authored by Paolo Bonzini

Merge tag 'kvm-x86-6.1-2' of https://github.com/sean-jc/linux into HEAD

KVM x86 updates for 6.1, batch #2:

 - Misc PMU fixes and cleanups.

 - Fixes for Hyper-V hypercall selftest
parents 458e9874 ea5cbc9f
+1 −0
@@ -333,6 +333,7 @@ Oleksij Rempel <linux@rempel-privat.de> <external.Oleksij.Rempel@de.bosch.com>
Oleksij Rempel <linux@rempel-privat.de> <fixed-term.Oleksij.Rempel@de.bosch.com>
Oleksij Rempel <linux@rempel-privat.de> <o.rempel@pengutronix.de>
Oleksij Rempel <linux@rempel-privat.de> <ore@pengutronix.de>
Oliver Upton <oliver.upton@linux.dev> <oupton@google.com>
Pali Rohár <pali@kernel.org> <pali.rohar@gmail.com>
Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Patrick Mochel <mochel@digitalimplant.org>
+5 −0
@@ -1355,6 +1355,11 @@ PAGE_SIZE multiple when read back.
	  pagetables
                Amount of memory allocated for page tables.

	  sec_pagetables
		Amount of memory allocated for secondary page tables,
		this currently includes KVM mmu allocations on x86
		and arm64.

	  percpu (npn)
		Amount of memory used for storing per-cpu kernel
		data structures.
+4 −0
@@ -982,6 +982,7 @@ Example output. You may not have all of these fields.
    SUnreclaim:       142336 kB
    KernelStack:       11168 kB
    PageTables:        20540 kB
    SecPageTables:         0 kB
    NFS_Unstable:          0 kB
    Bounce:                0 kB
    WritebackTmp:          0 kB
@@ -1090,6 +1091,9 @@ KernelStack
              Memory consumed by the kernel stacks of all tasks
PageTables
              Memory consumed by userspace page tables
SecPageTables
              Memory consumed by secondary page tables; this currently
              includes KVM mmu allocations on x86 and arm64.
NFS_Unstable
              Always zero. Previously counted pages which had been written to
              the server, but had not been committed to stable storage.
+6 −107
@@ -4074,7 +4074,7 @@ Queues an SMI on the thread's vcpu.
4.97 KVM_X86_SET_MSR_FILTER
----------------------------

:Capability: KVM_X86_SET_MSR_FILTER
:Capability: KVM_CAP_X86_MSR_FILTER
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_msr_filter
@@ -4173,8 +4173,10 @@ If an MSR access is not permitted through the filtering, it generates a
allows user space to deflect and potentially handle various MSR accesses
into user space.

If a vCPU is in running state while this ioctl is invoked, the vCPU may
experience inconsistent filtering behavior on MSR accesses.
Note, invoking this ioctl while a vCPU is running is inherently racy.  However,
KVM does guarantee that vCPUs will see either the previous filter or the new
filter, e.g. MSRs with identical settings in both the old and new filter will
have deterministic behavior.
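
The filtering rules documented for KVM_X86_SET_MSR_FILTER can be illustrated
with a small user-space model. This is only a sketch under stated assumptions,
not KVM's implementation: the ``msr_filter`` types and ``msr_access_allowed()``
are hypothetical names that mirror the documented fields, a range is assumed to
cover ``nmsrs`` indices starting at ``base``, the first range that covers both
the index and the access type is assumed to decide, and bit ``n`` of a bitmap
is assumed to live in byte ``n / 8`` at position ``n % 8``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the documented flag values. */
#define MSR_FILTER_READ          (1u << 0)
#define MSR_FILTER_WRITE         (1u << 1)
#define MSR_FILTER_DEFAULT_DENY  (1u << 0)
#define MSR_FILTER_MAX_RANGES    16

struct msr_filter_range {
	uint32_t flags;        /* access types this range applies to */
	uint32_t nmsrs;        /* number of MSRs in the bitmap */
	uint32_t base;         /* MSR index the bitmap starts at */
	const uint8_t *bitmap; /* 1 allows the access, 0 denies it */
};

struct msr_filter {
	uint32_t flags;
	struct msr_filter_range ranges[MSR_FILTER_MAX_RANGES];
};

/* type must be exactly MSR_FILTER_READ or MSR_FILTER_WRITE. */
static bool msr_access_allowed(const struct msr_filter *f,
			       uint32_t index, uint32_t type)
{
	assert(type == MSR_FILTER_READ || type == MSR_FILTER_WRITE);

	/* x2APIC MSRs (0x800 to 0x8ff) are documented as always allowed. */
	if (index >= 0x800 && index <= 0x8ff)
		return true;

	for (size_t i = 0; i < MSR_FILTER_MAX_RANGES; i++) {
		const struct msr_filter_range *r = &f->ranges[i];

		/* Skip empty ranges and ranges not covering this access type. */
		if (r->nmsrs == 0 || !(r->flags & type))
			continue;
		/* Assumption: the range covers [base, base + nmsrs). */
		if (index < r->base || index - r->base >= r->nmsrs)
			continue;

		uint32_t bit = index - r->base;
		return (r->bitmap[bit / 8] >> (bit % 8)) & 1;
	}

	/* No range matched: fall back to the filter-wide default. */
	return !(f->flags & MSR_FILTER_DEFAULT_DENY);
}
```

In this model a filter with ``MSR_FILTER_DEFAULT_DENY`` and a single read-only
range rejects every write outside the x2APIC window, because no range covers
write accesses and the default applies.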

4.98 KVM_CREATE_SPAPR_TCE_64
----------------------------
@@ -5287,110 +5289,7 @@ KVM_PV_DUMP
    authentication tag all of which are needed to decrypt the dump at a
    later time.


4.126 KVM_X86_SET_MSR_FILTER
----------------------------

:Capability: KVM_CAP_X86_MSR_FILTER
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_msr_filter
:Returns: 0 on success, < 0 on error

::

  struct kvm_msr_filter_range {
  #define KVM_MSR_FILTER_READ  (1 << 0)
  #define KVM_MSR_FILTER_WRITE (1 << 1)
	__u32 flags;
	__u32 nmsrs; /* number of msrs in bitmap */
	__u32 base;  /* MSR index the bitmap starts at */
	__u8 *bitmap; /* a 1 bit allows the operations in flags, 0 denies */
  };

  #define KVM_MSR_FILTER_MAX_RANGES 16
  struct kvm_msr_filter {
  #define KVM_MSR_FILTER_DEFAULT_ALLOW (0 << 0)
  #define KVM_MSR_FILTER_DEFAULT_DENY  (1 << 0)
	__u32 flags;
	struct kvm_msr_filter_range ranges[KVM_MSR_FILTER_MAX_RANGES];
  };

flags values for ``struct kvm_msr_filter_range``:

``KVM_MSR_FILTER_READ``

  Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
  indicates that a read should immediately fail, while a 1 indicates that
  a read for a particular MSR should be handled regardless of the default
  filter action.

``KVM_MSR_FILTER_WRITE``

  Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
  indicates that a write should immediately fail, while a 1 indicates that
  a write for a particular MSR should be handled regardless of the default
  filter action.

``KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE``

  Filter both read and write accesses to MSRs using the given bitmap. A 0
  in the bitmap indicates that both reads and writes should immediately fail,
  while a 1 indicates that reads and writes for a particular MSR are not
  filtered by this range.

flags values for ``struct kvm_msr_filter``:

``KVM_MSR_FILTER_DEFAULT_ALLOW``

  If no filter range matches an MSR index that is getting accessed, KVM will
  fall back to allowing access to the MSR.

``KVM_MSR_FILTER_DEFAULT_DENY``

  If no filter range matches an MSR index that is getting accessed, KVM will
  fall back to rejecting access to the MSR. In this mode, all MSRs that should
  be processed by KVM need to explicitly be marked as allowed in the bitmaps.

This ioctl allows user space to define up to 16 bitmaps of MSR ranges to
specify whether a certain MSR access should be explicitly filtered for or not.

If this ioctl has never been invoked, MSR accesses are not guarded and the
default KVM in-kernel emulation behavior is fully preserved.

Calling this ioctl with an empty set of ranges (all nmsrs == 0) disables MSR
filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
an error.

As soon as the filtering is in place, every MSR access is processed through
the filtering except for accesses to the x2APIC MSRs (from 0x800 to 0x8ff);
x2APIC MSRs are always allowed, independent of the ``default_allow`` setting,
and their behavior depends on the ``X2APIC_ENABLE`` bit of the APIC base
register.

If a bit is within one of the defined ranges, read and write accesses are
guarded by the bitmap's value for the MSR index if the kind of access
is included in the ``struct kvm_msr_filter_range`` flags.  If no range
cover this particular access, the behavior is determined by the flags
field in the kvm_msr_filter struct: ``KVM_MSR_FILTER_DEFAULT_ALLOW``
and ``KVM_MSR_FILTER_DEFAULT_DENY``.

Each bitmap range specifies a range of MSRs to potentially allow access on.
The range goes from MSR index [base .. base+nmsrs]. The flags field
indicates whether reads, writes or both reads and writes are filtered
by setting a 1 bit in the bitmap for the corresponding MSR index.

If an MSR access is not permitted through the filtering, it generates a
#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
allows user space to deflect and potentially handle various MSR accesses
into user space.

Note, invoking this ioctl with a vCPU is running is inherently racy.  However,
KVM does guarantee that vCPUs will see either the previous filter or the new
filter, e.g. MSRs with identical settings in both the old and new filter will
have deterministic behavior.

4.127 KVM_XEN_HVM_SET_ATTR
4.126 KVM_XEN_HVM_SET_ATTR
--------------------------

:Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
+1 −27
@@ -97,7 +97,7 @@ VCPU requests are simply bit indices of the ``vcpu->requests`` bitmap.
This means general bitops, like those documented in [atomic-ops]_ could
also be used, e.g. ::

  clear_bit(KVM_REQ_UNHALT & KVM_REQUEST_MASK, &vcpu->requests);
  clear_bit(KVM_REQ_UNBLOCK & KVM_REQUEST_MASK, &vcpu->requests);

However, VCPU request users should refrain from doing so, as it would
break the abstraction.  The first 8 bits are reserved for architecture
@@ -126,17 +126,6 @@ KVM_REQ_UNBLOCK
  or in order to update the interrupt routing and ensure that assigned
  devices will wake up the vCPU.

KVM_REQ_UNHALT

  This request may be made from the KVM common function kvm_vcpu_block(),
  which is used to emulate an instruction that causes a CPU to halt until
  one of an architectural specific set of events and/or interrupts is
  received (determined by checking kvm_arch_vcpu_runnable()).  When that
  event or interrupt arrives kvm_vcpu_block() makes the request.  This is
  in contrast to when kvm_vcpu_block() returns due to any other reason,
  such as a pending signal, which does not indicate the VCPU's halt
  emulation should stop, and therefore does not make the request.

KVM_REQ_OUTSIDE_GUEST_MODE

  This "request" ensures the target vCPU has exited guest mode prior to the
@@ -297,21 +286,6 @@ architecture dependent. kvm_vcpu_block() calls kvm_arch_vcpu_runnable()
to check if it should awaken.  One reason to do so is to provide
architectures a function where requests may be checked if necessary.

Clearing Requests
-----------------

Generally it only makes sense for the receiving VCPU thread to clear a
request.  However, in some circumstances, such as when the requesting
thread and the receiving VCPU thread are executed serially, such as when
they are the same thread, or when they are using some form of concurrency
control to temporarily execute synchronously, then it's possible to know
that the request may be cleared immediately, rather than waiting for the
receiving VCPU thread to handle the request in VCPU RUN.  The only current
examples of this are kvm_vcpu_block() calls made by VCPUs to block
themselves.  A possible side-effect of that call is to make the
KVM_REQ_UNHALT request, which may then be cleared immediately when the
VCPU returns from the call.

References
==========
