  1. Oct 31, 2021
    • Merge tag 'kvm-s390-next-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD · 9c6eb531
      Paolo Bonzini authored
      
      KVM: s390: Fixes and Features for 5.16
      
      - SIGP Fixes
      - initial preparations for lazy destroy of secure VMs
      - storage key improvements/fixes
      - Log the guest CPNC
    • RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions · 7c8de080
      Anup Patel authored
      The rs1 operand of the HFENCE.GVMA instruction is the guest physical
      address right-shifted by 2 (i.e. divided by 4).
      
      Unfortunately, we overlooked the semantics of the rs1 register for
      the HFENCE.GVMA instruction and never right-shifted the guest physical
      address by 2. This issue has not manifested for hypervisors until
      now because:
        1) Currently, only __kvm_riscv_hfence_gvma_all() and SBI
           HFENCE calls are used to invalidate TLB.
        2) All H-extension implementations (such as QEMU, Spike,
           Rocket Core FPGA, etc) that we tried till now were
           conservatively flushing everything upon any HFENCE.GVMA
           instruction.
      
      This patch fixes the GPA passed to __kvm_riscv_hfence_gvma_vmid_gpa()
      and __kvm_riscv_hfence_gvma_gpa() functions.
      
      Fixes: fd7bb4a2 ("RISC-V: KVM: Implement VMID allocator")
      Reported-by: Ian Huang <ihuang@ventanamicro.com>
      Signed-off-by: Anup Patel <anup.patel@wdc.com>
      Message-Id: <20211026170136.2147619-4-anup.patel@wdc.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
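
The operand fix described in this commit can be illustrated with a minimal sketch. The helper name below is hypothetical and is not taken from the kernel sources; the only thing it models is the statement above that rs1 carries the guest physical address right-shifted by 2.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): build the value that goes
 * into rs1 for HFENCE.GVMA from a guest physical address.  Per the commit
 * message, rs1 holds the GPA right-shifted by 2, not the raw GPA.
 */
static inline uint64_t toy_hfence_gvma_rs1(uint64_t gpa)
{
	return gpa >> 2;
}
```

For example, under this sketch a GPA of 0x80001000 would be passed as 0x20000400. Passing the raw GPA would name an address four times larger than intended, which only goes unnoticed on implementations that flush everything on any HFENCE.GVMA.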
    • RISC-V: KVM: Factor-out FP virtualization into separate sources · 0a86512d
      Anup Patel authored
      
      
      Timer and SBI virtualization are already in separate sources.
      In the future, vector and AIA virtualization will also be added
      as separate sources.
      
      To align with the modularity described above, we factor out FP
      virtualization into separate sources.
      
      Signed-off-by: Anup Patel <anup.patel@wdc.com>
      Message-Id: <20211026170136.2147619-3-anup.patel@wdc.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Merge tag 'kvmarm-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD · 4e338684
      Paolo Bonzini authored
      KVM/arm64 updates for Linux 5.16
      
      - More progress on the protected VM front, now with the full
        fixed feature set as well as the limitation of some hypercalls
        after initialisation.
      
      - Cleanup of the RAZ/WI sysreg handling, which was pointlessly
        complicated
      
      - Fixes for the vgic placement in the IPA space, together with a
        bunch of selftests
      
      - More memcg accounting of the memory allocated on behalf of a guest
      
      - Timer and vgic selftests
      
      - Workarounds for the Apple M1 broken vgic implementation
      
      - KConfig cleanups
      
      - New kvmarm.mode=none option, for those who really dislike us
  2. Oct 27, 2021
  3. Oct 25, 2021
  4. Oct 23, 2021
  5. Oct 22, 2021
    • KVM: x86: Use rw_semaphore for APICv lock to allow vCPU parallelism · 187c8833
      Sean Christopherson authored
      
      
      Use a rw_semaphore instead of a mutex to coordinate APICv updates so that
      vCPUs responding to requests can take the lock for read and run in
      parallel.  Using a mutex forces serialization of vCPUs even though
      kvm_vcpu_update_apicv() only touches data local to that vCPU or is
      protected by a different lock, e.g. SVM's ir_list_lock.
      
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211022004927.1448382-5-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Move SVM's APICv sanity check to common x86 · ee49a893
      Sean Christopherson authored
      
      
      Move SVM's assertion that vCPU's APICv state is consistent with its VM's
      state out of svm_vcpu_run() and into x86's common inner run loop.  The
      assertion and underlying logic is not unique to SVM, it's just that SVM
      has more inhibiting conditions and thus is more likely to run headfirst
      into any KVM bugs.
      
      Add relevant comments to document exactly why the update path has unusual
      ordering between the update and the kick, why said ordering is safe, and
      also the basic rules behind the assertion in the run loop.
      
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211022004927.1448382-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • riscv: do not select non-existing config ANON_INODES · 9b4eb770
      Lukas Bulwahn authored
      Commit 99cdc6c1 ("RISC-V: Add initial skeletal KVM support") selects
      the config ANON_INODES in config KVM, but the config ANON_INODES was
      removed by commit 5dd50aae ("Make anon_inodes unconditional") in 2018.
      
      Hence, ./scripts/checkkconfigsymbols.py warns on non-existing symbols:
      
        ANON_INODES
        Referencing files: arch/riscv/kvm/Kconfig
      
      Remove selecting the non-existing config ANON_INODES.
      
      Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Message-Id: <20211022061514.25946-1-lukas.bulwahn@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Extract zapping of rmaps for gfn range to separate helper · 21fa3246
      Sean Christopherson authored
      
      
      Extract the zapping of rmaps, a.k.a. legacy MMU, for a gfn range to a
      separate helper to clean up the unholy mess that kvm_zap_gfn_range() has
      become.  In addition to deep nesting, the rmaps zapping spreads out the
      declaration of several variables and is generally a mess.  Clean up the
      mess now so that future work to improve the memslots implementation
      doesn't need to deal with it.
      
      Cc: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211022010005.1454978-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Drop a redundant remote TLB flush in kvm_zap_gfn_range() · e8be2a5b
      Sean Christopherson authored
      Remove an unnecessary remote TLB flush in kvm_zap_gfn_range() now that
      said function holds mmu_lock for write for its entire duration.  The
      flush was added by the now-reverted commit to allow TDP MMU to flush while
      holding mmu_lock for read, as the transition from write=>read required
      dropping the lock and thus a pending flush needed to be serviced.
      
      Fixes: 5a324c24 ("Revert "KVM: x86/mmu: Allow zap gfn range to operate under the mmu read lock"")
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Cc: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Cc: Ben Gardon <bgardon@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211022010005.1454978-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Drop a redundant, broken remote TLB flush · bc3b3c10
      Sean Christopherson authored
      A recent commit to fix the calls to kvm_flush_remote_tlbs_with_address()
      in kvm_zap_gfn_range() inadvertently added yet another flush instead of
      fixing the existing flush.  Drop the redundant flush, and fix the params
      for the existing flush.
      
      Cc: stable@vger.kernel.org
      Fixes: 2822da44 ("KVM: x86/mmu: fix parameters to kvm_flush_remote_tlbs_with_address")
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Cc: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20211022010005.1454978-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: X86: Don't unload MMU in kvm_vcpu_flush_tlb_guest() · 61b05a9f
      Lai Jiangshan authored
      
      
      kvm_mmu_unload() destroys all the PGD caches.  Use the lighter
      kvm_mmu_sync_roots() and kvm_mmu_sync_prev_roots() instead.
      
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20211019110154.4091-5-jiangshanlai@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: X86: pair smp_wmb() of mmu_try_to_unsync_pages() with smp_rmb() · 264d3dc1
      Lai Jiangshan authored
      The commit 578e1c4d ("kvm: x86: Avoid taking MMU lock
      in kvm_mmu_sync_roots if no sync is needed") added smp_wmb() in
      mmu_try_to_unsync_pages(), but the corresponding smp_load_acquire() isn't
      used on the load of SPTE.W.  smp_load_acquire() orders _subsequent_
      loads after sp->is_unsync; it does not order _earlier_ loads before
      the load of sp->is_unsync.
      
      This has no functional change; smp_rmb() is a NOP on x86, and no
      compiler barrier is required because there is a VMEXIT between the
      load of SPTE.W and kvm_mmu_sync_roots.
      
      Cc: Junaid Shahid <junaids@google.com>
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20211019110154.4091-4-jiangshanlai@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
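
The barrier pairing this commit relies on can be sketched with C11 atomics. This is an approximation under stated assumptions: a release fence stands in for smp_wmb(), an acquire fence stands in for smp_rmb(), and the toy_* names and the two-flag structure are hypothetical simplifications of sp->is_unsync and the SPTE.W bit.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical two-flag model of a shadow page. */
struct toy_sp {
	atomic_bool unsync;	/* stands in for sp->is_unsync */
	atomic_bool writable;	/* stands in for the SPTE.W bit */
};

static void toy_sp_init(struct toy_sp *sp)
{
	atomic_init(&sp->unsync, false);
	atomic_init(&sp->writable, false);
}

/* Writer side: mark the page unsync before making the SPTE writable. */
static void toy_mark_unsync_then_writable(struct toy_sp *sp)
{
	atomic_store_explicit(&sp->unsync, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* role of smp_wmb() */
	atomic_store_explicit(&sp->writable, true, memory_order_relaxed);
}

/*
 * Reader side: load the writable bit first, then the unsync flag.  The
 * fence orders the earlier load before the later one, so a reader that
 * saw writable == true must also see unsync == true.
 * Returns true when that invariant holds.
 */
static bool toy_reader_sees_consistent_state(struct toy_sp *sp)
{
	bool w = atomic_load_explicit(&sp->writable, memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);	/* role of smp_rmb() */
	bool u = atomic_load_explicit(&sp->unsync, memory_order_relaxed);

	return !w || u;
}
```

An acquire load of the unsync flag alone would not give this guarantee, because it only orders loads that come after it, which is the point the commit message makes about smp_load_acquire().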
    • KVM: X86: Cache CR3 in prev_roots when PCID is disabled · 509bfe3d
      Lai Jiangshan authored
      Commit 21823fbd ("KVM: x86: Invalidate all PGDs for the
      current PCID on MOV CR3 w/ flush") invalidates all PGDs for the specific
      PCID.  When PCID is disabled, that includes all PGDs in prev_roots, so
      the commit left prev_roots entirely unused in this case.
      
      Not using prev_roots fixes a problem when CR4.PCIDE is changed 0 -> 1
      before the said commit:
      
      	(CR4.PCIDE=0, CR4.PGE=1; CR3=cr3_a; the page for the guest
      	 RIP is global; cr3_b is cached in prev_roots)
      
      	modify page tables under cr3_b
      		the shadow root of cr3_b is unsync in kvm
      	INVPCID single context
      		the guest expects the TLB is clean for PCID=0
      	change CR4.PCIDE 0 -> 1
      	switch to cr3_b with PCID=0,NOFLUSH=1
      		No sync in kvm, cr3_b is still unsync in kvm
      	jump to the page that was modified in step 1
      		shadow page tables point to the wrong page
      
      It is a very unlikely case, but it shows that stale prev_roots can be
      a problem after CR4.PCIDE changes from 0 to 1.  However, to fix this
      case, the commit disabled caching CR3 in prev_roots altogether when PCID
      is disabled.  Not all CPUs have PCID; in particular, PCID support on
      AMD CPUs is fairly recent.  To restore the prev_roots optimization
      for CR4.PCIDE=0, flush the whole MMU (including all prev_roots) when
      CR4.PCIDE changes.
      
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20211019110154.4091-3-jiangshanlai@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
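
The rule stated at the end of this commit message (drop all cached roots whenever CR4.PCIDE changes) can be sketched as follows. This is a toy model, not the KVM implementation: the toy_* names, the cache size, and the invalid-root marker are all assumptions made for illustration. The only architectural fact used is that CR4.PCIDE is bit 17.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CR4_PCIDE		(UINT64_C(1) << 17)	/* CR4.PCIDE is bit 17 */
#define TOY_NUM_PREV_ROOTS	3			/* assumption: small fixed cache */
#define TOY_INVALID_ROOT	UINT64_MAX		/* assumption: "empty slot" marker */

struct toy_mmu {
	uint64_t cr4;
	uint64_t prev_roots[TOY_NUM_PREV_ROOTS];	/* guest CR3 values with a cached root */
};

/* Forget every cached root. */
static void toy_mmu_drop_prev_roots(struct toy_mmu *mmu)
{
	for (size_t i = 0; i < TOY_NUM_PREV_ROOTS; i++)
		mmu->prev_roots[i] = TOY_INVALID_ROOT;
}

static void toy_mmu_init(struct toy_mmu *mmu, uint64_t cr4)
{
	mmu->cr4 = cr4;
	toy_mmu_drop_prev_roots(mmu);
}

/* Remember a CR3 in the first free slot; returns false if the cache is full. */
static bool toy_mmu_cache_root(struct toy_mmu *mmu, uint64_t cr3)
{
	for (size_t i = 0; i < TOY_NUM_PREV_ROOTS; i++) {
		if (mmu->prev_roots[i] == TOY_INVALID_ROOT) {
			mmu->prev_roots[i] = cr3;
			return true;
		}
	}
	return false;
}

static bool toy_mmu_has_cached_root(const struct toy_mmu *mmu, uint64_t cr3)
{
	for (size_t i = 0; i < TOY_NUM_PREV_ROOTS; i++) {
		if (mmu->prev_roots[i] == cr3)
			return true;
	}
	return false;
}

/* On a CR4 write, drop all cached roots only if the PCIDE bit toggled. */
static void toy_set_cr4(struct toy_mmu *mmu, uint64_t new_cr4)
{
	if ((mmu->cr4 ^ new_cr4) & TOY_CR4_PCIDE)
		toy_mmu_drop_prev_roots(mmu);
	mmu->cr4 = new_cr4;
}
```

In this model, a root cached while CR4.PCIDE=0 survives unrelated CR4 changes but is gone after PCIDE flips to 1, so the stale-root scenario in the step list above cannot reuse it.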