  1. Mar 16, 2019
    • kvm: vmx: fix formatting of a comment · 4a605bc0
      Paolo Bonzini authored
      
      
      Eliminate a gratuitous conflict with 5.0.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: doc: Document the life cycle of a VM and its resources · eca6be56
      Sean Christopherson authored
      The series to add memcg accounting to KVM allocations[1] states:
      
        There are many KVM kernel memory allocations which are tied to the
        life of the VM process and should be charged to the VM process's
        cgroup.
      
      While it is correct to account KVM kernel allocations to the cgroup of
      the process that created the VM, it's technically incorrect to state
      that the KVM kernel memory allocations are tied to the life of the VM
      process.  This is because the VM itself, i.e. struct kvm, is not tied to
      the life of the process which created it, rather it is tied to the life
      of its associated file descriptor.  In other words, kvm_destroy_vm() is
      not invoked until fput() decrements its associated file's refcount to
      zero.  A simple example is to fork() in Qemu and have the child sleep
      indefinitely; kvm_destroy_vm() isn't called until Qemu closes its file
      descriptor *and* the rogue child is killed.
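
The lifetime rule described above (an object is released only when the last reference to its file goes away, not when the creating process closes its descriptor) is ordinary file-descriptor behaviour and can be observed without KVM. The sketch below is an analogy only: it uses a pipe in place of a VM file descriptor, and the function name `release_waits_for_last_reference` is made up for illustration. The pipe's write side plays the role of `struct kvm`, and end-of-file on the read side plays the role of `kvm_destroy_vm()` being reached.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Analogy only: a pipe stands in for the VM file descriptor.
 * Returns 1 if the write side stayed alive while a forked child held an
 * inherited descriptor and was released only after that child was killed,
 * 0 if either observation failed, -1 on setup error.
 */
int release_waits_for_last_reference(void)
{
	int fds[2];
	char byte;
	ssize_t n;

	if (pipe(fds) < 0)
		return -1;
	/* Non-blocking reads let the parent probe for EOF without hanging. */
	if (fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0)
		return -1;

	pid_t child = fork();
	if (child < 0)
		return -1;
	if (child == 0) {
		/* The "rogue child": keeps its inherited descriptors and sleeps. */
		for (;;)
			pause();
	}

	/* The creator closes its own write descriptor ... */
	close(fds[1]);

	/* ... but the child's copy is still a reference, so there is no EOF yet. */
	n = read(fds[0], &byte, 1);
	int alive_while_child_runs = (n < 0 && errno == EAGAIN);

	/* Killing and reaping the child drops the last write-side reference. */
	kill(child, SIGKILL);
	waitpid(child, NULL, 0);

	n = read(fds[0], &byte, 1);
	int released_after_kill = (n == 0);

	close(fds[0]);
	return alive_while_child_runs && released_after_kill;
}
```

Calling the function from a small `main()` and checking for a return value of 1 shows both halves of the claim: closing the creator's descriptor alone is not enough, and the release happens once the sleeping child is gone.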
      
      The allocations are guaranteed to be *accounted* to the process which
      created the VM, but only because KVM's per-{VM,vCPU} ioctls reject the
      ioctl() with -EIO if kvm->mm != current->mm.  I.e. the child can keep
      the VM "alive" but can't do anything useful with its reference.
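
The ownership check mentioned above can be pictured with a small userspace model. This is a sketch under stated assumptions, not KVM's code: `struct mm_model`, `struct vm_model` and `vm_ioctl_model()` are hypothetical names, and only the comparison against the caller's mm and the `-EIO` result are taken from the text.

```c
#include <errno.h>

/* Hypothetical stand-in for an address space (mm_struct in the kernel). */
struct mm_model {
	int id;
};

/* Hypothetical stand-in for struct kvm: remembers the creator's mm. */
struct vm_model {
	const struct mm_model *mm;
};

/*
 * Models the check described above: a per-VM or per-vCPU ioctl is rejected
 * with -EIO unless the caller's mm is the one that created the VM.
 */
int vm_ioctl_model(const struct vm_model *vm, const struct mm_model *current_mm)
{
	if (vm->mm != current_mm)
		return -EIO;
	return 0; /* the real handler would dispatch the ioctl here */
}
```

In this model a forked child has its own `mm_model`, so it can hold a reference to the VM but every call it makes returns `-EIO`.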
      
      Note that because 'struct kvm' also holds a reference to the mm_struct
      of its owner, the above behavior also applies to userspace allocations.
      
      Given that mucking with a VM's file descriptor can lead to subtle and
      undesirable behavior, e.g. memcg charges persisting after a VM is shut
      down, explicitly document a VM's lifecycle and its impact on the VM's
      resources.
      
      Alternatively, KVM could aggressively free resources when the creating
      process exits, e.g. via mmu_notifier->release().  However, mmu_notifier
      isn't guaranteed to be available, and freeing resources when the creator
      exits is likely to be error prone and fragile as KVM would need to
      ensure that it only freed resources that are truly out of reach. In
      practice, the existing behavior shouldn't be problematic as a properly
      configured system will prevent a child process from being moved out of
      the appropriate cgroup hierarchy, i.e. prevent hiding the process from
      the OOM killer, and will prevent an unprivileged user from being able to
      hold a reference to struct kvm via another method, e.g. debugfs.
      
      [1] https://patchwork.kernel.org/patch/10806707/

      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Merge tag 'kvm-ppc-next-5.1-3' of... · c7a0e83c
      Paolo Bonzini authored
      Merge tag 'kvm-ppc-next-5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
      
      Third PPC KVM update for 5.1
      
      - Tell userspace about whether a particular hardware workaround for
        one of the Spectre vulnerabilities is available, so that userspace
        can inform the guest.
    • MAINTAINERS: Add KVM selftests to existing KVM entry · 46333236
      Sean Christopherson authored
      
      
      It's safe to assume Paolo and Radim are maintaining the KVM selftests
      given that the vast majority of commits have their SOBs.  Play nice
      with get_maintainers and make it official.
      
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()" · 92da008f
      Ben Gardon authored
      This reverts commit 71883a62.
      
      The above commit contains an optimization to kvm_zap_gfn_range which
      uses gfn-limited TLB flushes, if enabled. If using these limited flushes,
      kvm_zap_gfn_range passes lock_flush_tlb=false to slot_handle_level_range
      which creates a race when the function unlocks to call cond_resched.
      See an example of this race below:
      
      CPU 0                   CPU 1                           CPU 3
      // zap_direct_gfn_range
      mmu_lock()
      // *ptep == pte_1
      *ptep = 0
      if (lock_flush_tlb)
              flush_tlbs()
      mmu_unlock()
                              // In invalidate range
                              // MMU notifier
                              mmu_lock()
                              if (pte != 0)
                                      *ptep = 0
                                      flush = true
                              if (flush)
                                      flush_remote_tlbs()
                              mmu_unlock()
                              return
                              // Host MM reallocates
                              // page previously
                              // backing guest memory.
                                                              // Guest accesses
                                                              // invalid page
                                                              // through pte_1
                                                              // in its TLB!!
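
The interleaving above can be replayed deterministically in a toy model. The sketch below is illustrative only: the `struct model` type, its fields and `guest_sees_stale_page()` are invented for this example, and a single cached translation stands in for the TLB of the CPU running the guest. It runs the CPU 0 zap, then the CPU 1 notifier, and reports whether a stale translation survives.

```c
#include <stdbool.h>

/* Toy state: one page table entry and one cached translation of it. */
struct model {
	unsigned long pte;       /* 0 means not present */
	unsigned long tlb_entry; /* 0 means nothing cached */
};

static void flush_tlbs(struct model *m)
{
	m->tlb_entry = 0;
}

/* CPU 0: clear the PTE; flush before dropping the lock only if asked to. */
static void zap_gfn_range(struct model *m, bool lock_flush_tlb)
{
	m->pte = 0;
	if (lock_flush_tlb)
		flush_tlbs(m);
	/* mmu_unlock() would happen here, opening the race window. */
}

/* CPU 1: MMU notifier invalidation; it flushes only if it zapped something. */
static void mmu_notifier_invalidate(struct model *m)
{
	bool flush = false;

	if (m->pte != 0) {
		m->pte = 0;
		flush = true;
	}
	if (flush)
		flush_tlbs(m);
}

/* Returns true if the guest can still reach the old page via its TLB. */
bool guest_sees_stale_page(bool lock_flush_tlb)
{
	struct model m = { .pte = 0x1000, .tlb_entry = 0x1000 };

	zap_gfn_range(&m, lock_flush_tlb);
	mmu_notifier_invalidate(&m);
	/* At this point the host may have reallocated the backing page. */
	return m.tlb_entry != 0;
}
```

With `lock_flush_tlb` false the notifier finds the PTE already cleared, skips its flush, and the cached translation survives; with it true the flush happens before the lock is dropped and nothing stale remains.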
      
      Tested: Ran all kvm-unit-tests on an Intel Haswell machine with and
      	without this patch. The patch introduced no new failures.
      
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. Mar 01, 2019
  3. Feb 27, 2019
    • KVM: PPC: Fix compilation when KVM is not enabled · e74d53e3
      Paul Mackerras authored
      Compiling with CONFIG_PPC_POWERNV=y and KVM disabled currently gives
      an error like this:
      
        CC      arch/powerpc/kernel/dbell.o
      In file included from arch/powerpc/kernel/dbell.c:20:0:
      arch/powerpc/include/asm/kvm_ppc.h: In function ‘xics_on_xive’:
      arch/powerpc/include/asm/kvm_ppc.h:625:9: error: implicit declaration of function ‘xive_enabled’ [-Werror=implicit-function-declaration]
        return xive_enabled() && cpu_has_feature(CPU_FTR_HVMODE);
               ^
      cc1: all warnings being treated as errors
      scripts/Makefile.build:276: recipe for target 'arch/powerpc/kernel/dbell.o' failed
      make[3]: *** [arch/powerpc/kernel/dbell.o] Error 1
      
      Fix this by making the xics_on_xive() definition conditional on the
      same symbol (CONFIG_KVM_BOOK3S_64_HANDLER) that determines whether we
      include <asm/xive.h> or not, since that's the header that defines
      xive_enabled().
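
The shape of such a fix can be sketched as follows. This is not the actual patch: the stub bodies, the `cpu_has_hvmode()` helper (standing in for `cpu_has_feature(CPU_FTR_HVMODE)`) and the `false` fallback in the `#else` branch are assumptions made so the fragment compiles on its own. Only the idea of guarding `xics_on_xive()` with `CONFIG_KVM_BOOK3S_64_HANDLER` comes from the commit message.

```c
#include <stdbool.h>

#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
/* Hypothetical stand-ins for what <asm/xive.h> and the CPU feature code provide. */
static inline bool xive_enabled(void)
{
	return true;
}

static inline bool cpu_has_hvmode(void)
{
	return true;
}

/* Defined only where the header declaring xive_enabled() is also included. */
static inline bool xics_on_xive(void)
{
	return xive_enabled() && cpu_has_hvmode();
}
#else
/* Assumed fallback so callers still compile when the symbol is not set. */
static inline bool xics_on_xive(void)
{
	return false;
}
#endif
```

Built without `-DCONFIG_KVM_BOOK3S_64_HANDLER`, the fragment never references `xive_enabled()`, which is the property the fix relies on.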
      
      Fixes: 03f95332 ("KVM: PPC: Book3S: Allow XICS emulation to work in nested hosts using XIVE")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
  4. Feb 23, 2019
  5. Feb 22, 2019
  6. Feb 21, 2019