  1. Jan 15, 2022
  2. Jan 08, 2022
    • kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID · 445ecdf7
      Jing Liu authored
      
      
      KVM_GET_SUPPORTED_CPUID should not include any dynamic xstates in
      CPUID[0xD] if they have not been requested with prctl. Otherwise
      a process which directly passes KVM_GET_SUPPORTED_CPUID to
      KVM_SET_CPUID2 would now fail even if it doesn't intend to use a
      dynamically enabled feature. Userspace must know that prctl is
      required and allocate a >4K xstate buffer before setting any dynamic
      bit.
      
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Jing Liu <jing2.liu@intel.com>
      Signed-off-by: Yang Zhong <yang.zhong@intel.com>
      Message-Id: <20220105123532.12586-5-yang.zhong@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      445ecdf7
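The filtering described in the KVM_GET_SUPPORTED_CPUID commit above can be pictured with a small Python model. This is only a sketch, not the kernel code: the function name and the choice of XTILE_DATA as the single dynamic component are assumptions made for illustration; the bit positions follow the XCR0 layout.

```python
# Bit positions follow the x86 XCR0 layout.
XFEATURE_MASK_XTILE_CFG = 1 << 17
XFEATURE_MASK_XTILE_DATA = 1 << 18
# Assumption for this sketch: XTILE_DATA is the only dynamic component.
XFEATURE_MASK_USER_DYNAMIC = XFEATURE_MASK_XTILE_DATA


def supported_xcr0(host_xcr0: int, guest_perm: int) -> int:
    """Hypothetical helper: XCR0 bits to report in CPUID[0xD].

    Dynamic bits survive only if they were requested (present in
    guest_perm); all other host-supported bits pass through unchanged.
    """
    unpermitted_dynamic = XFEATURE_MASK_USER_DYNAMIC & ~guest_perm
    return host_xcr0 & ~unpermitted_dynamic
```

With this shape, a process that never requested the dynamic feature gets a CPUID image it can pass straight to KVM_SET_CPUID2.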
    • kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule · cc04b6a2
      Jing Liu authored
      
      
      CPUID.0xD.1.EBX enumerates the size of the XSAVE area (in compacted
      format) required by XSAVES. If CPUID.0xD.i.ECX[1] is set for a state
      component (i), this state component should be located on the next
      64-byte boundary following the preceding state component in the
      compacted layout.
      
      Fix xstate_required_size() to follow the alignment rule. AMX is the
      first state component with 64-byte alignment, which is what exposed
      this bug.
      
      Signed-off-by: Jing Liu <jing2.liu@intel.com>
      Signed-off-by: Yang Zhong <yang.zhong@intel.com>
      Message-Id: <20220105123532.12586-4-yang.zhong@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cc04b6a2
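As a rough illustration of the alignment rule in the xstate_required_size() commit above, the following Python sketch models a compacted-format size calculation. It is not the kernel function: the `Component` tuple and `compacted_size()` are hypothetical names, and callers supply the component sizes. The 512-byte legacy region, the 64-byte XSAVE header, and the ECX[1] alignment rule are the only architectural facts it relies on.

```python
from typing import Iterable, NamedTuple


class Component(NamedTuple):
    """Hypothetical description of one state component (CPUID.0xD.i)."""
    bit: int        # component index i
    size: int       # CPUID.0xD.i.EAX
    align64: bool   # CPUID.0xD.i.ECX[1]


LEGACY_REGION_SIZE = 512   # x87/SSE legacy area
XSAVE_HEADER_SIZE = 64


def compacted_size(xstate_bv: int, components: Iterable[Component]) -> int:
    """Size of the compacted XSAVE area for the enabled components."""
    offset = LEGACY_REGION_SIZE + XSAVE_HEADER_SIZE
    for comp in sorted(components, key=lambda c: c.bit):
        if comp.bit < 2:
            continue  # x87 and SSE live in the legacy region
        if not (xstate_bv >> comp.bit) & 1:
            continue  # component not enabled
        if comp.align64:
            offset = (offset + 63) & ~63  # round up to a 64-byte boundary
        offset += comp.size
    return offset
```

Without the rounding step, an aligned component that follows a component whose end is not a multiple of 64 would be placed too early and the total would come out too small.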
    • x86/fpu: Prepare guest FPU for dynamically enabled FPU features · 36487e62
      Thomas Gleixner authored
      
      
      To support dynamically enabled FPU features for guests, prepare the guest
      pseudo FPU container to keep track of the currently enabled xfeatures and
      the guest permissions.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Jing Liu <jing2.liu@intel.com>
      Signed-off-by: Yang Zhong <yang.zhong@intel.com>
      Message-Id: <20220105123532.12586-3-yang.zhong@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      36487e62
    • x86/fpu: Extend fpu_xstate_prctl() with guest permissions · 980fe2fd
      Thomas Gleixner authored
      
      
      KVM requires a clear separation of host user space and guest permissions
      for dynamic XSTATE components.
      
      Add a guest permissions member to struct fpu and a separate set of prctl()
      arguments: ARCH_GET_XCOMP_GUEST_PERM and ARCH_REQ_XCOMP_GUEST_PERM.
      
      The semantics are equivalent to the host user space permission control
      except for the following constraints:
      
        1) Permissions have to be requested before the first vCPU is created
      
        2) Permissions are frozen when the first vCPU is created to ensure
           consistency. Any attempt to expand permissions via the prctl() after
           that point is rejected.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Jing Liu <jing2.liu@intel.com>
      Signed-off-by: Yang Zhong <yang.zhong@intel.com>
      Message-Id: <20220105123532.12586-2-yang.zhong@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      980fe2fd
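The two constraints listed in the guest-permissions commit above (request before the first vCPU, frozen afterwards) can be modelled in a few lines of Python. This is a toy model under stated assumptions: the class and method names are hypothetical, the request takes a feature bit number, and the error type is chosen for illustration rather than taken from the kernel.

```python
class GuestFpuPermissions:
    """Toy model of per-process guest xfeature permissions."""

    def __init__(self, default_perm: int) -> None:
        self.guest_perm = default_perm  # bitmask of permitted xfeatures
        self.frozen = False             # set once the first vCPU exists

    def get_guest_perm(self) -> int:
        """Stands in for the ARCH_GET_XCOMP_GUEST_PERM request."""
        return self.guest_perm

    def req_guest_perm(self, feature_nr: int) -> None:
        """Stands in for the ARCH_REQ_XCOMP_GUEST_PERM request."""
        wanted = self.guest_perm | (1 << feature_nr)
        if wanted == self.guest_perm:
            return  # already permitted, nothing to expand
        if self.frozen:
            # Assumption: any expansion after vCPU creation is an error.
            raise PermissionError("guest permissions are frozen")
        self.guest_perm = wanted

    def create_vcpu(self) -> None:
        """Creating the first vCPU freezes the permission set."""
        self.frozen = True
```

Re-requesting an already granted feature stays harmless after the freeze; only an attempt to expand the set is rejected.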
    • kvm: selftests: move ucall declarations into ucall_common.h · 96c1a628
      Michael Roth authored
      
      
      Now that core kvm_util declarations have a special home in
      kvm_util_base.h, move ucall-related declarations out into a separate
      header.
      
      Signed-off-by: Michael Roth <michael.roth@amd.com>
      Message-Id: <20211210164620.11636-3-michael.roth@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      96c1a628
    • kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h · 7d9a662e
      Michael Roth authored
      
      
      Between helper macros and interfaces that will be introduced in
      subsequent patches, much of kvm_util.h would end up being declarations
      specific to ucall. Ideally these could be separated out into a separate
      header since they are not strictly required for writing guest tests and
      are mostly self-contained interfaces other than a reliance on a few
      core declarations like struct kvm_vm. This doesn't make a big
      difference as far as how tests will be compiled/written since all these
      interfaces will still be packaged up into a single/common libkvm.a used
      by all tests, but it is still nice to be able to compartmentalize to
      improve readability and reduce merge conflicts in the future for common
      tasks like adding new interfaces to kvm_util.h.
      
      Furthermore, some of the ucall declarations will be arch-specific,
      requiring various #ifdef'ery in kvm_util.h. Ideally these declarations
      could live in separate arch-specific headers, e.g.
      include/<arch>/ucall.h, which would handle arch-specific declarations
      as well as pulling in common ucall-related declarations shared by all
      archs.
      
      One simple way to do this would be to #include ucall.h at the bottom of
      kvm_util.h, after declarations it relies upon like struct kvm_vm.
      This is brittle however, and doesn't scale easily to other sets of
      interfaces that may be added in the future.
      
      Instead, move all declarations currently in kvm_util.h into
      kvm_util_base.h, then have kvm_util.h #include it. With this change,
      non-base declarations can be selectively moved/introduced into separate
      headers, which can then be included in kvm_util.h so that individual
      tests don't need to be touched. Subsequent patches will then move
      ucall-related declarations into a separate header to meet the above
      goals.
      
      Signed-off-by: Michael Roth <michael.roth@amd.com>
      Message-Id: <20211210164620.11636-2-michael.roth@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7d9a662e
  3. Jan 07, 2022
    • KVM: SVM: include CR3 in initial VMSA state for SEV-ES guests · 405329fc
      Michael Roth authored
      
      
      Normally guests will set up CR3 themselves, but some guests, such as
      kselftests, and potentially CONFIG_PVH guests, rely on being booted
      with paging enabled and CR3 initialized to a pre-allocated page table.
      
      Currently CR3 updates via KVM_SET_SREGS* are not loaded into the guest
      VMCB until just prior to entering the guest. For SEV-ES/SEV-SNP, this
      is too late, since it will have switched over to using the VMSA page
      prior to that point, with the VMSA CR3 copied from the VMCB initial
      CR3 value: 0.
      
      Address this by sync'ing the CR3 value into the VMCB save area
      immediately when KVM_SET_SREGS* is issued so it will find its way into
      the initial VMSA.
      
      Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Michael Roth <michael.roth@amd.com>
      Message-Id: <20211216171358.61140-10-michael.roth@amd.com>
      [Remove vmx_post_set_cr3; add a remark about kvm_set_cr3 not calling the
       new hook. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      405329fc
    • KVM: VMX: Provide vmread version using asm-goto-with-outputs · 907d1393
      Peter Zijlstra authored
      
      
      Use asm-goto-output for smaller fast path code.
      
      Message-Id: <YbcbbGW2GcMx6KpD@hirez.programming.kicks-ass.net>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      907d1393
    • KVM: x86: Fix wall clock writes in Xen shared_info not to mark page dirty · 55749769
      David Woodhouse authored
      When dirty ring logging is enabled, any dirty logging without an active
      vCPU context will cause a kernel oops. But we've already declared that
      the shared_info page doesn't get dirty tracking anyway, since it would
      be kind of insane to mark it dirty every time we deliver an event channel
      interrupt. Userspace is supposed to just assume it's always dirty any
      time a vCPU can run or event channels are routed.
      
      So stop using the generic kvm_write_wall_clock() and just write directly
      through the gfn_to_pfn_cache that we already have set up.
      
      We can make kvm_write_wall_clock() static in x86.c again now, but let's
      not remove the 'sec_hi_ofs' argument even though it's not used yet. At
      some point we *will* want to use that for KVM guests too.
      
      Fixes: 629b5348 ("KVM: x86/xen: update wallclock region")
      Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211210163625.2886-6-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      55749769
    • KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery · 14243b38
      David Woodhouse authored
      
      
      This adds basic support for delivering 2-level event channels to a guest.
      
      Initially, it only supports delivery via the IRQ routing table, triggered
      by an eventfd. In order to do so, it has a kvm_xen_set_evtchn_fast()
      function which will use the pre-mapped shared_info page if it already
      exists and is still valid, while the slow path through the irqfd_inject
      workqueue will remap the shared_info page if necessary.
      
      It sets the bits in the shared_info page but not the vcpu_info; that is
      deferred to __kvm_xen_has_interrupt() which raises the vector to the
      appropriate vCPU.
      
      Add a 'verbose' mode to xen_shinfo_test while adding test cases for this.
      
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211210163625.2886-5-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      14243b38
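The shared_info half of 2-level event channel delivery, as described in the commit above, comes down to bit arithmetic on two bitmaps. The Python sketch below is an assumption-laden model, not the kernel's kvm_xen_set_evtchn_fast(): it assumes 64-bit words, and `SharedInfo` and `set_evtchn_pending()` are hypothetical names. It returns the word index that the deferred per-vCPU step would need to flag.

```python
from typing import Optional

BITS_PER_WORD = 64  # assumption: 64-bit guest layout


class SharedInfo:
    """Minimal stand-in for the pending and mask bitmaps."""

    def __init__(self, nr_words: int = 64) -> None:
        self.evtchn_pending = [0] * nr_words
        self.evtchn_mask = [0] * nr_words


def set_evtchn_pending(shinfo: SharedInfo, port: int) -> Optional[int]:
    """Mark a port pending in shared_info.

    Returns the selector word index if the target vCPU still needs to be
    notified, or None if the port was already pending or is masked.
    """
    word, bit = divmod(port, BITS_PER_WORD)
    bit_mask = 1 << bit
    was_pending = shinfo.evtchn_pending[word] & bit_mask
    shinfo.evtchn_pending[word] |= bit_mask
    if was_pending or shinfo.evtchn_mask[word] & bit_mask:
        return None
    return word
```

Updating vcpu_info and raising the vector are deliberately left out, mirroring the commit's note that this part is deferred.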
    • KVM: x86/xen: Maintain valid mapping of Xen shared_info page · 1cfc9c4b
      David Woodhouse authored
      
      
      Use the newly reinstated gfn_to_pfn_cache to maintain a kernel mapping
      of the Xen shared_info page so that it can be accessed in atomic context.
      
      Note that we do not participate in dirty tracking for the shared info
      page and we do not explicitly mark it dirty every single time we deliver
      an event channel interrupt. We wouldn't want to do that even if we *did*
      have a valid vCPU context with which to do so.
      
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211210163625.2886-4-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1cfc9c4b
    • KVM: Reinstate gfn_to_pfn_cache with invalidation support · 982ed0de
      David Woodhouse authored
      
      
      This can be used in two modes. There is an atomic mode where the cached
      mapping is accessed while holding the rwlock, and a mode where the
      physical address is used by a vCPU in guest mode.
      
      For the latter case, an invalidation will wake the vCPU with the new
      KVM_REQ_GPC_INVALIDATE, and the architecture will need to refresh any
      caches it still needs to access before entering guest mode again.
      
      Only one vCPU can be targeted by the wake requests; it's simple enough
      to make it wake all vCPUs or even a mask but I don't see a use case for
      that additional complexity right now.
      
      Invalidation happens from the invalidate_range_start MMU notifier, which
      needs to be able to sleep in order to wake the vCPU and wait for it.
      
      This means that revalidation potentially needs to "wait" for the MMU
      operation to complete and the invalidate_range_end notifier to be
      invoked. Like the vCPU when it takes a page fault in that period, we
      just spin — fixing that in a future patch by implementing an actual
      *wait* may be another part of shaving this particularly hirsute yak.
      
      As noted in the comments in the function itself, the only case where
      the invalidate_range_start notifier is expected to be called *without*
      being able to sleep is when the OOM reaper is killing the process. In
      that case, we expect the vCPU threads already to have exited, and thus
      there will be nothing to wake, and no reason to wait. So we clear the
      KVM_REQUEST_WAIT bit and send the request anyway, then complain loudly
      if there actually *was* anything to wake up.
      
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211210163625.2886-3-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      982ed0de
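The check/invalidate/refresh cycle described in the gfn_to_pfn_cache commit above can be sketched as a small state machine. This Python model is illustrative only: `PfnCache`, its methods, and the `translate` callback are hypothetical, and locking, vCPU wake-ups and the wait for invalidate_range_end are all omitted.

```python
from typing import Callable, Optional


class PfnCache:
    """Toy cached translation that an MMU notifier can invalidate."""

    def __init__(self, uhva: int, translate: Callable[[int], int]) -> None:
        self.uhva = uhva              # userspace address backing the page
        self.translate = translate    # hypothetical address-to-pfn lookup
        self.pfn: Optional[int] = None
        self.valid = False

    def check(self) -> bool:
        """True if the cached mapping may be used as-is."""
        return self.valid

    def refresh(self) -> int:
        """Re-resolve the mapping after an invalidation."""
        self.pfn = self.translate(self.uhva)
        self.valid = True
        return self.pfn

    def invalidate_range_start(self, start: int, end: int) -> bool:
        """Notifier hook: drop the mapping if [start, end) covers it.

        Returns True when a user of the mapping would need to be woken.
        """
        if self.valid and start <= self.uhva < end:
            self.valid = False
            return True
        return False
```

A caller in the atomic mode would run check() under the lock and fall back to refresh() from a context that can sleep when it fails.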
    • KVM: Warn if mark_page_dirty() is called without an active vCPU · 2efd61a6
      David Woodhouse authored
      The various kvm_write_guest() and mark_page_dirty() functions must only
      ever be called in the context of an active vCPU, because if dirty ring
      tracking is enabled it may simply oops when kvm_get_running_vcpu()
      returns NULL for the vcpu and then kvm_dirty_ring_get() dereferences it.
      
      This oops was reported by "butt3rflyh4ck" <butterflyhuangxx@gmail.com> in
      https://lore.kernel.org/kvm/CAFcO6XOmoS7EacN_n6v4Txk7xL7iqRa2gABg3F7E3Naf5uG94g@mail.gmail.com/
      
      
      
      That actual bug will be fixed under separate cover but this warning
      should help to prevent new ones from being added.
      
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211210163625.2886-2-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2efd61a6
    • x86/kvm: Silence per-cpu pr_info noise about KVM clocks and steal time · f3f26dae
      David Woodhouse authored
      
      
      I made the actual CPU bringup go nice and fast... and then Linux spends
      half a minute printing stupid nonsense about clocks and steal time for
      each of 256 vCPUs. Don't do that. Nobody cares.
      
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20211209150938.3518-12-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f3f26dae
    • KVM: x86: Update vPMCs when retiring branch instructions · 018d70ff
      Eric Hankland authored
      
      
      When KVM retires a guest branch instruction through emulation,
      increment any vPMCs that are configured to monitor "branch
      instructions retired," and update the sample period of those counters
      so that they will overflow at the right time.
      
      Signed-off-by: Eric Hankland <ehankland@google.com>
      [jmattson:
        - Split the code to increment "branch instructions retired" into a
          separate commit.
        - Moved/consolidated the calls to kvm_pmu_trigger_event() in the
          emulation of VMLAUNCH/VMRESUME to accommodate the evolution of
          that code.
      ]
      Fixes: f5132b01 ("KVM: Expose a version 2 architectural PMU to a guests")
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Message-Id: <20211130074221.93635-7-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      018d70ff
    • KVM: x86: Update vPMCs when retiring instructions · 9cd803d4
      Eric Hankland authored
      
      
      When KVM retires a guest instruction through emulation, increment any
      vPMCs that are configured to monitor "instructions retired," and
      update the sample period of those counters so that they will overflow
      at the right time.
      
      Signed-off-by: Eric Hankland <ehankland@google.com>
      [jmattson:
        - Split the code to increment "branch instructions retired" into a
          separate commit.
        - Added 'static' to kvm_pmu_incr_counter() definition.
        - Modified kvm_pmu_incr_counter() to check pmc->perf_event->state ==
          PERF_EVENT_STATE_ACTIVE.
      ]
      Fixes: f5132b01 ("KVM: Expose a version 2 architectural PMU to a guests")
      Signed-off-by: Jim Mattson <jmattson@google.com>
      [likexu:
        - Drop checks for pmc->perf_event or event state or event type
        - Increase a counter once its umask bits and the first 8 select bits are matched
        - Rewrite kvm_pmu_incr_counter() with a less invasive approach to the host perf;
        - Rename kvm_pmu_record_event to kvm_pmu_trigger_event;
        - Add counter enable and CPL check for kvm_pmu_trigger_event();
      ]
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211130074221.93635-6-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9cd803d4
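The counter update described in the two vPMC commits above (match on the umask bits and the first 8 select bits, require the counter to be enabled, then increment) can be sketched as follows. This Python model is an illustration, not kvm_pmu_trigger_event(): the `VirtualPmc` class and `trigger_event()` are hypothetical, the CPL check is left out, and overflow is reduced to a flag instead of a sample-period update.

```python
from typing import Iterable

# Field layout of an architectural event select register.
EVENTSEL_EVENT = 0x000000FF   # event select, bits 0-7
EVENTSEL_UMASK = 0x0000FF00   # unit mask, bits 8-15
EVENTSEL_ENABLE = 1 << 22     # counter enable

# Architectural event encodings (event select | umask << 8).
INSTRUCTIONS_RETIRED = 0x00C0
BRANCH_INSTRUCTIONS_RETIRED = 0x00C4


class VirtualPmc:
    """Hypothetical stand-in for one virtual performance counter."""

    def __init__(self, eventsel: int, width: int = 48) -> None:
        self.eventsel = eventsel
        self.counter = 0
        self.width_mask = (1 << width) - 1
        self.overflowed = False


def trigger_event(pmcs: Iterable[VirtualPmc], event: int) -> None:
    """Bump every enabled counter programmed for the emulated event."""
    match = EVENTSEL_EVENT | EVENTSEL_UMASK
    for pmc in pmcs:
        if not pmc.eventsel & EVENTSEL_ENABLE:
            continue  # counter disabled
        if (pmc.eventsel ^ event) & match:
            continue  # programmed for a different event
        pmc.counter = (pmc.counter + 1) & pmc.width_mask
        if pmc.counter == 0:
            pmc.overflowed = True  # wrapped around
```

An emulated branch would call the function once for each event it retires, so a counter programmed for either event advances as it would on hardware.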
    • KVM: x86/pmu: Add pmc->intr to refactor kvm_perf_overflow{_intr}() · 40ccb96d
      Like Xu authored
      
      
      Depending on whether intr should be triggered or not, KVM registers
      two different event overflow callbacks in the perf_event context.
      
      The code skeletons of these two functions are very similar, so
      the intr flag can be stored in pmc->intr by pmc_reprogram_counter(),
      which gives a smaller instruction footprint for the
      u-architecture branch predictor.
      
      The __kvm_perf_overflow() can be called in non-nmi contexts
      and a flag is needed to distinguish the caller context and thus
      avoid a check on kvm_is_in_guest(), otherwise we might get
      warnings from suspicious RCU or check_preemption_disabled().
      
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211130074221.93635-5-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      40ccb96d
    • KVM: x86/pmu: Reuse pmc_perf_hw_id() and drop find_fixed_event() · 6ed1298e
      Like Xu authored
      
      
      Since we set the same semantic event value for the fixed counter in
      pmc->eventsel, returning the perf_hw_id for the fixed counter via
      find_fixed_event() can be painlessly replaced by pmc_perf_hw_id()
      with the help of pmc_is_fixed() check.
      
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211130074221.93635-4-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6ed1298e
    • KVM: x86/pmu: Refactoring find_arch_event() to pmc_perf_hw_id() · 7c174f30
      Like Xu authored
      
      
      The find_arch_event() returns an "unsigned int" value,
      which is used by the pmc_reprogram_counter() to
      program a PERF_TYPE_HARDWARE type perf_event.
      
      The returned value is actually the kernel-defined generic
      perf_hw_id, so rename it to pmc_perf_hw_id() with simpler
      incoming parameters to make it more self-explanatory.
      
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211130074221.93635-3-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7c174f30
    • KVM: x86/pmu: Setup pmc->eventsel for fixed PMCs · 76187563
      Like Xu authored
      
      
      The current pmc->eventsel for fixed counters is underutilised. The
      pmc->eventsel can be set up for all known available fixed counters
      since we have a mapping between the fixed pmc index and
      the intel_arch_events array.
      
      For both gp and fixed counters, this will simplify the later checks
      for consistency between eventsel and perf_hw_id.
      
      Signed-off-by: Like Xu <likexu@tencent.com>
      Message-Id: <20211130074221.93635-2-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      76187563
    • KVM: x86: avoid out of bounds indices for fixed performance counters · 006a0f06
      Paolo Bonzini authored
      
      
      Because IceLake has 4 fixed performance counters but KVM only
      supports 3, it is possible for reprogram_fixed_counters to pass
      to reprogram_fixed_counter an index that is out of bounds for the
      fixed_pmc_events array.
      
      Ultimately intel_find_fixed_event, which is the only place that uses
      fixed_pmc_events, handles this correctly because it checks against the
      size of fixed_pmc_events anyway.  Every other place operates on the
      fixed_counters[] array which is sized according to INTEL_PMC_MAX_FIXED.
      However, it is cleaner if the unsupported performance counters are culled
      early on in reprogram_fixed_counters.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      006a0f06
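The early culling suggested in the fixed-counter commit above amounts to a bounds check at the top of the reprogramming loop. The Python sketch below shows the idea under stated assumptions: the event names in `FIXED_PMC_EVENTS` are placeholders, the 4-bit control field per fixed counter follows the fixed-counter control register layout, and the function returns the indices it would reprogram instead of touching any hardware state.

```python
from typing import List

# Placeholder table: one entry per fixed counter the model supports.
FIXED_PMC_EVENTS = ["instructions", "cpu-cycles", "ref-cycles"]


def fixed_ctrl_field(ctrl: int, idx: int) -> int:
    """Extract the 4-bit control field for fixed counter idx."""
    return (ctrl >> (idx * 4)) & 0xF


def reprogram_fixed_counters(old_ctrl: int, new_ctrl: int,
                             nr_fixed_counters: int) -> List[int]:
    """Return the fixed counter indices whose control field changed."""
    reprogrammed = []
    for idx in range(nr_fixed_counters):
        if idx >= len(FIXED_PMC_EVENTS):
            continue  # hardware has it, the model does not: cull early
        if fixed_ctrl_field(old_ctrl, idx) == fixed_ctrl_field(new_ctrl, idx):
            continue  # nothing changed for this counter
        reprogrammed.append(idx)
    return reprogrammed
```

With four counters reported and three supported, index 3 never reaches the per-counter path, so no later lookup has to defend against it.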
    • KVM: VMX: Mark VCPU_EXREG_CR3 dirty when !CR0_PG -> CR0_PG if EPT + !URG · 5b61178c
      Lai Jiangshan authored
      
      
      When !CR0_PG -> CR0_PG, vcpu->arch.cr3 becomes active, but GUEST_CR3 is
      still vmx->ept_identity_map_addr if EPT + !URG.  So VCPU_EXREG_CR3 is
      considered to be dirty and GUEST_CR3 needs to be updated in this case.
      
      Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20211216021938.11752-4-jiangshanlai@gmail.com>
      Fixes: c62c7bd4 ("KVM: VMX: Update vmcs.GUEST_CR3 only when the guest CR3 is dirty")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5b61178c
    • KVM: x86/mmu: Reconstruct shadow page root if the guest PDPTEs is changed · 6b123c3a
      Lai Jiangshan authored
      For shadow paging, the page table needs to be reconstructed before the
      coming VMENTER if the guest PDPTEs are changed.
      
      But not all paths that call load_pdptrs() will cause the page tables to be
      reconstructed. Normally, kvm_mmu_reset_context() and kvm_mmu_free_roots()
      are used to launch later reconstruction.
      
      The commit d81135a5("KVM: x86: do not reset mmu if CR0.CD and
      CR0.NW are changed") skips kvm_mmu_reset_context() after load_pdptrs()
      when changing CR0.CD and CR0.NW.
      
      The commit 21823fbd("KVM: x86: Invalidate all PGDs for the current
      PCID on MOV CR3 w/ flush") skips kvm_mmu_free_roots() after
      load_pdptrs() when rewriting the CR3 with the same value.
      
      The commit a91a7c70("KVM: X86: Don't reset mmu context when
      toggling X86_CR4_PGE") skips kvm_mmu_reset_context() after
      load_pdptrs() when changing CR4.PGE.
      
      Guests like Linux keep the PDPTEs unchanged for every instance of
      pagetable, so this missing reconstruction causes no problem for Linux
      guests.
      
      Fixes: d81135a5 ("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")
      Fixes: 21823fbd ("KVM: x86: Invalidate all PGDs for the current PCID on MOV CR3 w/ flush")
      Fixes: a91a7c70 ("KVM: X86: Don't reset mmu context when toggling X86_CR4_PGE")
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20211216021938.11752-3-jiangshanlai@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6b123c3a
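One way to picture the fix described in the PDPTE commit above is to make the reconstruction decision inside the PDPTE load itself, so that no caller can skip it. The Python sketch below is a guess at that shape, not the kernel's load_pdptrs(): the `Mmu` class is hypothetical, and clearing `root_valid` merely stands in for scheduling a root rebuild.

```python
from typing import Sequence


class Mmu:
    """Hypothetical shadow-MMU state: cached PDPTEs plus root validity."""

    def __init__(self) -> None:
        self.pdptrs = [0, 0, 0, 0]
        self.root_valid = True


def load_pdptrs(mmu: Mmu, new_pdptes: Sequence[int]) -> None:
    """Cache the guest PDPTEs and drop the shadow root if they changed."""
    if list(new_pdptes) != mmu.pdptrs:
        # Stand-in for freeing the roots so that they are rebuilt
        # before the next guest entry.
        mmu.root_valid = False
    mmu.pdptrs = list(new_pdptes)
```

A guest that reloads identical PDPTEs, as the commit message says Linux does, keeps its shadow root in this model.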
    • KVM: VMX: Save HOST_CR3 in vmx_set_host_fs_gs() · a9f2705e
      Lai Jiangshan authored
      The host CR3 in the vcpu thread can only be changed when scheduling,
      so commit 15ad9762 ("KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()")
      changed vmx.c to only save it in vmx_prepare_switch_to_guest().
      
      However, it also has to be synced in vmx_sync_vmcs_host_state() when switching VMCS.
      vmx_set_host_fs_gs() is called in both places, so rename it to
      vmx_set_vmcs_host_state() and make it update HOST_CR3.
      
      Fixes: 15ad9762 ("KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()")
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20211216021938.11752-2-jiangshanlai@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a9f2705e
    • Revert "KVM: X86: Update mmu->pdptrs only when it is changed" · 46cbc040
      Paolo Bonzini authored
      This reverts commit 24cd19a2.
      Sean Christopherson reports:
      
      "Commit 24cd19a2 ('KVM: X86: Update mmu->pdptrs only when it is
      changed') breaks nested VMs with EPT in L0 and PAE shadow paging in L2.
      Reproducing is trivial, just disable EPT in L1 and run a VM.  I haven't
      investigating how it breaks things."
      
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      46cbc040
    • selftests: KVM: sev_migrate_tests: Add mirror command tests · a6fec539
      Peter Gonda authored
      
      
      Add tests to confirm mirror VMs can only run the correct subset of commands.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Marc Orr <marcorr@google.com>
      Signed-off-by: Peter Gonda <pgonda@google.com>
      Message-Id: <20211208191642.3792819-4-pgonda@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a6fec539
    • selftests: KVM: sev_migrate_tests: Fix sev_ioctl() · 427d046a
      Peter Gonda authored
      
      
      TEST_ASSERT in sev_ioctl() was allowing errors because it checked that
      the return value was good OR the FW error code was OK. This TEST_ASSERT
      should require that both values (i.e. AND) are OK. Remove the
      LAUNCH_START from the mirror VM, since that call correctly fails:
      mirror VMs cannot issue this command. Currently, issues with the PSP
      driver functions mean the firmware error is not always reset to
      SEV_RET_SUCCESS when a call is successful. Mainly, sev_platform_init()
      doesn't correctly set the fw error if the platform has already been
      initialized.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Marc Orr <marcorr@google.com>
      Signed-off-by: Peter Gonda <pgonda@google.com>
      Message-Id: <20211208191642.3792819-3-pgonda@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      427d046a
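The OR-versus-AND mistake described in the sev_ioctl() commit above is easy to see in isolation. The Python snippet below is a minimal sketch, not the selftest itself: both predicate names are hypothetical, and the value 0 for SEV_RET_SUCCESS is an assumption of the model.

```python
SEV_RET_SUCCESS = 0  # assumption: success is encoded as 0


def ioctl_ok_buggy(ret: int, fw_error: int) -> bool:
    """Old check: passes if either value looks fine."""
    return ret == 0 or fw_error == SEV_RET_SUCCESS


def ioctl_ok_fixed(ret: int, fw_error: int) -> bool:
    """Fixed check: both the return value and the FW error must be OK."""
    return ret == 0 and fw_error == SEV_RET_SUCCESS
```

A failing ioctl that leaves the firmware error at its success value slips through the first predicate and is caught by the second.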