Commit 0cbdfd9b authored by Tao Xu, committed by Aichun Shi

KVM: VMX: Enable Notify VM exit

mainline inclusion
from mainline-v6.0-rc1
commit 2f4073e0
category: feature
feature: Notify VM exit
bugzilla: https://gitee.com/openeuler/intel-kernel/issues/I5PAJ5
CVE: N/A
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2f4073e0

Intel-SIG: commit 2f4073e0 ("KVM: VMX: Enable Notify VM exit")

-------------------------------------

KVM: VMX: Enable Notify VM exit

A malicious virtual machine can cause a CPU to get stuck because event
windows never open up, e.g., an infinite loop in microcode on a nested #AC
(CVE-2015-5307). No event window means no event (NMI, SMI or IRQ) can be
delivered, which makes the CPU unavailable to the host or other VMs.

The VMM can enable notify VM exit, so that a VM exit is generated if no
event window occurs in VM non-root mode for a specified amount of time (the
notify window).

Feature enabling:
- The new VMCS control bit SECONDARY_EXEC_NOTIFY_VM_EXITING is introduced to
  enable this feature. The VMM can set the NOTIFY_WINDOW VMCS field to adjust
  the expected notify window.
- Add a new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT so that user space
  can query and enable this feature in per-VM scope. The argument is a
  64-bit value: bits 63:32 are used for the notify window, and bits 31:0 are
  for flags. Currently supported flags:
  - KVM_X86_NOTIFY_VMEXIT_ENABLED: enable the feature with the notify
    window provided.
  - KVM_X86_NOTIFY_VMEXIT_USER: exit to userspace once the exits happen.
- It's safe to even set notify window to zero since an internal hardware
  threshold is added to vmcs.notify_window.
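
The argument layout described above can be sketched as a pair of small
helpers. This is only an illustration: the flag values are copied from the
api.rst hunk of this patch, while the helper names notify_vmexit_arg(),
notify_vmexit_window(), notify_vmexit_flags() and notify_vmexit_flags_valid()
are hypothetical and not part of the patch. The validity check mirrors the
documented -EINVAL behaviour for unknown flag bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as listed in the api.rst hunk of this patch. */
#define KVM_X86_NOTIFY_VMEXIT_ENABLED	(1 << 0)
#define KVM_X86_NOTIFY_VMEXIT_USER	(1 << 1)
#define KVM_X86_NOTIFY_VMEXIT_VALID_BITS	(KVM_X86_NOTIFY_VMEXIT_ENABLED | \
						 KVM_X86_NOTIFY_VMEXIT_USER)

/* Hypothetical helper: notify window in bits 63:32, flags in bits 31:0. */
static uint64_t notify_vmexit_arg(uint32_t window, uint32_t flags)
{
	return ((uint64_t)window << 32) | flags;
}

/* Hypothetical helper: extract the notify window (bits 63:32). */
static uint32_t notify_vmexit_window(uint64_t arg)
{
	return (uint32_t)(arg >> 32);
}

/* Hypothetical helper: extract the flags (bits 31:0). */
static uint32_t notify_vmexit_flags(uint64_t arg)
{
	return (uint32_t)arg;
}

/* Hypothetical helper: true if only known flag bits are set. */
static bool notify_vmexit_flags_valid(uint64_t arg)
{
	return (notify_vmexit_flags(arg) & ~(uint32_t)KVM_X86_NOTIFY_VMEXIT_VALID_BITS) == 0;
}
```

User space would presumably pass a value built this way as args[0] of
KVM_ENABLE_CAP on the VM file descriptor, per the documentation added by
this patch.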

VM exit handling:
- Introduce a vcpu stat notify_window_exits to record the count of
  notify VM exits and expose it through debugfs.
- A notify VM exit can happen incident to delivery of a vectored event.
  Allow it in KVM.
- Exit to userspace unconditionally for handling when VM_CONTEXT_INVALID
  bit is set.
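
The exit-handling policy in the list above can be modelled in a few lines.
This is a standalone sketch, not the handler from the patch: the struct
notify_exit_result type and the model_notify_exit() function are
hypothetical, and resuming the guest in the remaining case is an assumption
inferred from the flag descriptions. The constants are copied from the
hunks shown in this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants as they appear in the hunks of this patch. */
#define NOTIFY_VM_CONTEXT_INVALID	(1ULL << 0)	/* exit qualification bit */
#define KVM_X86_NOTIFY_VMEXIT_USER	(1 << 1)	/* per-VM capability flag */
#define KVM_NOTIFY_CONTEXT_INVALID	(1 << 0)	/* reported in notify.flags */

/* Hypothetical result type for this model. */
struct notify_exit_result {
	bool exit_to_userspace;	/* false: assumed to resume the guest */
	uint32_t run_flags;	/* value that would be reported in notify.flags */
};

/*
 * Hypothetical model of the policy: count every notify VM exit, and exit
 * to userspace if the VM context is invalid or the USER flag was set.
 */
static struct notify_exit_result
model_notify_exit(uint64_t exit_qual, uint32_t vm_flags,
		  uint64_t *notify_window_exits)
{
	struct notify_exit_result r = { false, 0 };
	bool context_invalid = (exit_qual & NOTIFY_VM_CONTEXT_INVALID) != 0;

	++*notify_window_exits;

	if (context_invalid || (vm_flags & KVM_X86_NOTIFY_VMEXIT_USER)) {
		r.exit_to_userspace = true;
		r.run_flags = context_invalid ? KVM_NOTIFY_CONTEXT_INVALID : 0;
	}
	return r;
}
```

The point of the model is that an invalid VM context forces an exit to
userspace even when KVM_X86_NOTIFY_VMEXIT_USER was not requested.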

Nested handling:
- Nested notify VM exits are not supported yet. Keep the same notify
  window control in vmcs02 as vmcs01, so that L1 can't escape the
  restriction of notify VM exits through launching L2 VM.

Notify VM exit is defined in the latest Intel Architecture Instruction Set
Extensions Programming Reference, chapter 9.2.

Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220524135624.22988-5-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aichun Shi <aichun.shi@intel.com>
parent ebf77f63
+49 −0
@@ -5496,6 +5496,26 @@ array field represents return values. The userspace should update the return
values of SBI call before resuming the VCPU. For more details on RISC-V SBI
spec refer, https://github.com/riscv/riscv-sbi-doc.

::

    /* KVM_EXIT_NOTIFY */
    struct {
  #define KVM_NOTIFY_CONTEXT_INVALID	(1 << 0)
      __u32 flags;
    } notify;

Used on x86 systems. When the VM capability KVM_CAP_X86_NOTIFY_VMEXIT is
enabled, a VM exit is generated if no event window occurs in VM non-root mode
for a specified amount of time. If KVM_X86_NOTIFY_VMEXIT_USER was set when
enabling the cap, KVM exits to userspace with the exit reason
KVM_EXIT_NOTIFY for further handling. The "flags" field contains more
detailed info.

The valid value for 'flags' is:

  - KVM_NOTIFY_CONTEXT_INVALID -- the VM context is corrupted and not valid
    in VMCS. Resuming the target VM would lead to an unknown result.

::

		/* Fix the size of the union. */
@@ -6288,6 +6308,35 @@ the bus lock vm exit can be preempted by a higher priority VM exit, the exit
notifications to userspace can be KVM_EXIT_BUS_LOCK or other reasons.
KVM_RUN_BUS_LOCK flag is used to distinguish between them.

7.25 KVM_CAP_X86_NOTIFY_VMEXIT
------------------------------

:Architectures: x86
:Target: VM
:Parameters: args[0] is the value of notify window as well as some flags
:Returns: 0 on success, -EINVAL if args[0] contains invalid flags or notify
          VM exit is unsupported.

Bits 63:32 of args[0] are used for notify window.
Bits 31:0 of args[0] are for some flags. Valid bits are::

  #define KVM_X86_NOTIFY_VMEXIT_ENABLED    (1 << 0)
  #define KVM_X86_NOTIFY_VMEXIT_USER       (1 << 1)

This capability allows userspace to turn the notify VM exit on or off
in per-VM scope during VM creation. Notify VM exit is disabled by default.
When userspace sets the KVM_X86_NOTIFY_VMEXIT_ENABLED bit in args[0], the VMM
enables this feature with the notify window provided, which generates
a VM exit if no event window occurs in VM non-root mode for a specified amount
of time (notify window).

If KVM_X86_NOTIFY_VMEXIT_USER is set in args[0], KVM exits to userspace for
handling whenever a notify VM exit happens.

This capability is intended to mitigate the threat of malicious VMs causing
the CPU to get stuck (because event windows never open up) and making the CPU
unavailable to the host or other VMs.

8. Other capabilities.
======================

+9 −0
@@ -55,6 +55,9 @@
#define KVM_BUS_LOCK_DETECTION_VALID_MODE	(KVM_BUS_LOCK_DETECTION_OFF | \
						 KVM_BUS_LOCK_DETECTION_EXIT)

#define KVM_X86_NOTIFY_VMEXIT_VALID_BITS	(KVM_X86_NOTIFY_VMEXIT_ENABLED | \
						 KVM_X86_NOTIFY_VMEXIT_USER)

/* x86-specific vcpu->requests bit members */
#define KVM_REQ_MIGRATE_TIMER		KVM_ARCH_REQ(0)
#define KVM_REQ_REPORT_TPR_ACCESS	KVM_ARCH_REQ(1)
@@ -1001,6 +1004,9 @@ struct kvm_arch {

	bool bus_lock_detection_enabled;

	u32 notify_window;
	u32 notify_vmexit_flags;

	/* Guest can access the SGX PROVISIONKEY. */
	bool sgx_provisioning_allowed;

@@ -1093,6 +1099,7 @@ struct kvm_vcpu_stat {
	u64 preemption_reported;
	u64 preemption_other;
	u64 preemption_timer_exits;
	u64 notify_window_exits;
};

struct x86_instruction_info;
@@ -1443,6 +1450,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
extern u64  kvm_default_tsc_scaling_ratio;
/* bus lock detection supported? */
extern bool kvm_has_bus_lock_exit;
/* notify vmexit supported? */
extern bool kvm_has_notify_vmexit;

extern u64 kvm_mce_cap_supported;

+7 −0
Original line number Diff line number Diff line
@@ -76,6 +76,7 @@
#define SECONDARY_EXEC_TSC_SCALING              VMCS_CONTROL_BIT(TSC_SCALING)
#define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE	VMCS_CONTROL_BIT(USR_WAIT_PAUSE)
#define SECONDARY_EXEC_BUS_LOCK_DETECTION	VMCS_CONTROL_BIT(BUS_LOCK_DETECTION)
#define SECONDARY_EXEC_NOTIFY_VM_EXITING	VMCS_CONTROL_BIT(NOTIFY_VM_EXITING)

/*
 * Definitions of Tertiary Processor-Based VM-Execution Controls.
@@ -280,6 +281,7 @@ enum vmcs_field {
	SECONDARY_VM_EXEC_CONTROL       = 0x0000401e,
	PLE_GAP                         = 0x00004020,
	PLE_WINDOW                      = 0x00004022,
	NOTIFY_WINDOW                   = 0x00004024,
	VM_INSTRUCTION_ERROR            = 0x00004400,
	VM_EXIT_REASON                  = 0x00004402,
	VM_EXIT_INTR_INFO               = 0x00004404,
@@ -566,6 +568,11 @@ enum vm_entry_failure_code {
#define EPT_VIOLATION_EXECUTABLE	(1 << EPT_VIOLATION_EXECUTABLE_BIT)
#define EPT_VIOLATION_GVA_TRANSLATED	(1 << EPT_VIOLATION_GVA_TRANSLATED_BIT)

/*
 * Exit Qualifications for NOTIFY VM EXIT
 */
#define NOTIFY_VM_CONTEXT_INVALID     BIT(0)

/*
 * VM-instruction error numbers
 */
+1 −0
Original line number Diff line number Diff line
@@ -86,6 +86,7 @@
#define VMX_FEATURE_USR_WAIT_PAUSE	( 2*32+ 26) /* Enable TPAUSE, UMONITOR, UMWAIT in guest */
#define VMX_FEATURE_ENCLV_EXITING	( 2*32+ 28) /* "" VM-Exit on ENCLV (leaf dependent) */
#define VMX_FEATURE_BUS_LOCK_DETECTION	( 2*32+ 30) /* "" VM-Exit when bus lock caused */
#define VMX_FEATURE_NOTIFY_VM_EXITING	( 2*32+ 31) /* VM-Exit when no event windows after notify window */

/* Tertiary Processor-Based VM-Execution Controls, word 3 */
#define VMX_TERTIARY_FEATURE_IPI_VIRT		( 3*32+  4) /* Enable IPI virtualization */
+3 −1
Original line number Diff line number Diff line
@@ -90,6 +90,7 @@
#define EXIT_REASON_UMWAIT              67
#define EXIT_REASON_TPAUSE              68
#define EXIT_REASON_BUS_LOCK            74
#define EXIT_REASON_NOTIFY              75

#define VMX_EXIT_REASONS \
	{ EXIT_REASON_EXCEPTION_NMI,         "EXCEPTION_NMI" }, \
@@ -151,7 +152,8 @@
	{ EXIT_REASON_XRSTORS,               "XRSTORS" }, \
	{ EXIT_REASON_UMWAIT,                "UMWAIT" }, \
	{ EXIT_REASON_TPAUSE,                "TPAUSE" }, \
	{ EXIT_REASON_BUS_LOCK,              "BUS_LOCK" }
	{ EXIT_REASON_BUS_LOCK,              "BUS_LOCK" }, \
	{ EXIT_REASON_NOTIFY,                "NOTIFY" }

#define VMX_EXIT_REASON_FLAGS \
	{ VMX_EXIT_REASONS_FAILED_VMENTRY,	"FAILED_VMENTRY" }