Skip to content
  1. Aug 21, 2019
  2. Aug 15, 2019
  3. Aug 14, 2019
    • Miaohe Lin's avatar
      KVM: x86: svm: remove redundant assignment of var new_entry · c8e174b3
      Miaohe Lin authored
      
      
      new_entry is reassigned a new value next line. So
      it's redundant and remove it.
      
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c8e174b3
    • Paolo Bonzini's avatar
      MAINTAINERS: add KVM x86 reviewers · ed4e7b05
      Paolo Bonzini authored
      
      
      This is probably overdue---KVM x86 has quite a few contributors that
      usually review each other's patches, which is really helpful to me.
      Formalize this by listing them as reviewers.  I am including people
      with various expertise:
      
      - Joerg for SVM (with designated reviewers, it makes more sense to have
      him in the main KVM/x86 stanza)
      
      - Sean for MMU and VMX
      
      - Jim for VMX
      
      - Vitaly for Hyper-V and possibly SVM
      
      - Wanpeng for LAPIC and paravirtualization.
      
      Please ack if you are okay with this arrangement, otherwise speak up.
      
      In other news, Radim is going to leave Red Hat soon.  However, he has
      not been very much involved in upstream KVM development for some time,
      and in the immediate future he is still going to help maintain kvm/queue
      while I am on vacation.  Since not much is going to change, I will let
      him decide whether he wants to keep the maintainer role after he leaves.
      
      Acked-by: default avatarJoerg Roedel <joro@8bytes.org>
      Acked-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      ed4e7b05
    • Paolo Bonzini's avatar
      MAINTAINERS: change list for KVM/s390 · 74260dc2
      Paolo Bonzini authored
      
      
      KVM/s390 does not have a list of its own, and linux-s390 is in the
      loop anyway thanks to the generic arch/s390 match.  So use the generic
      KVM list for s390 patches.
      
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      74260dc2
    • Radim Krcmar's avatar
      kvm: x86: skip populating logical dest map if apic is not sw enabled · b14c876b
      Radim Krcmar authored
      
      
      recalculate_apic_map does not santize ldr and it's possible that
      multiple bits are set. In that case, a previous valid entry
      can potentially be overwritten by an invalid one.
      
      This condition is hit when booting a 32 bit, >8 CPU, RHEL6 guest and then
      triggering a crash to boot a kdump kernel. This is the sequence of
      events:
      1. Linux boots in bigsmp mode and enables PhysFlat, however, it still
      writes to the LDR which probably will never be used.
      2. However, when booting into kdump, the stale LDR values remain as
      they are not cleared by the guest and there isn't a apic reset.
      3. kdump boots with 1 cpu, and uses Logical Destination Mode but the
      logical map has been overwritten and points to an inactive vcpu.
      
      Signed-off-by: default avatarRadim Krcmar <rkrcmar@redhat.com>
      Signed-off-by: default avatarBandan Das <bsd@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      b14c876b
  4. Aug 09, 2019
    • Paolo Bonzini's avatar
      Merge tag 'kvmarm-fixes-for-5.3-2' of... · a738b5e7
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-for-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm fixes for 5.3, take #2
      
      - Fix our system register reset so that we stop writing
        non-sensical values to them, and track which registers
        get reset instead.
      - Sync VMCR back from the GIC on WFI so that KVM has an
        exact vue of PMR.
      - Reevaluate state of HW-mapped, level triggered interrupts
        on enable.
      a738b5e7
    • Paolo Bonzini's avatar
      Merge tag 'kvmarm-fixes-for-5.3' of... · 0e1c438c
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm fixes for 5.3
      
      - A bunch of switch/case fall-through annotation, fixing one actual bug
      - Fix PMU reset bug
      - Add missing exception class debug strings
      0e1c438c
    • Naresh Kamboju's avatar
      selftests: kvm: Adding config fragments · c096397c
      Naresh Kamboju authored
      
      
      selftests kvm test cases need pre-required kernel configs for the test
      to get pass.
      
      Signed-off-by: default avatarNaresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c096397c
    • Thomas Huth's avatar
      KVM: selftests: Update gitignore file for latest changes · e2c26537
      Thomas Huth authored
      
      
      The kvm_create_max_vcpus test has been moved to the main directory,
      and sync_regs_test is now available on s390x, too.
      
      Signed-off-by: default avatarThomas Huth <thuth@redhat.com>
      Acked-by: default avatarShuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      e2c26537
    • Paolo Bonzini's avatar
      kvm: remove unnecessary PageReserved check · 8f946da7
      Paolo Bonzini authored
      
      
      The same check is already done in kvm_is_reserved_pfn.
      
      Reviewed-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      8f946da7
    • Alexandru Elisei's avatar
      KVM: arm/arm64: vgic: Reevaluate level sensitive interrupts on enable · 16e604a4
      Alexandru Elisei authored
      
      
      A HW mapped level sensitive interrupt asserted by a device will not be put
      into the ap_list if it is disabled at the VGIC level. When it is enabled
      again, it will be inserted into the ap_list and written to a list register
      on guest entry regardless of the state of the device.
      
      We could argue that this can also happen on real hardware, when the command
      to enable the interrupt reached the GIC before the device had the chance to
      de-assert the interrupt signal; however, we emulate the distributor and
      redistributors in software and we can do better than that.
      
      Signed-off-by: default avatarAlexandru Elisei <alexandru.elisei@arm.com>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      16e604a4
    • Marc Zyngier's avatar
      KVM: arm: Don't write junk to CP15 registers on reset · c69509c7
      Marc Zyngier authored
      
      
      At the moment, the way we reset CP15 registers is mildly insane:
      We write junk to them, call the reset functions, and then check that
      we have something else in them.
      
      The "fun" thing is that this can happen while the guest is running
      (PSCI, for example). If anything in KVM has to evaluate the state
      of a CP15 register while junk is in there, bad thing may happen.
      
      Let's stop doing that. Instead, we track that we have called a
      reset function for that register, and assume that the reset
      function has done something.
      
      In the end, the very need of this reset check is pretty dubious,
      as it doesn't check everything (a lot of the CP15 reg leave outside
      of the cp15_regs[] array). It may well be axed in the near future.
      
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      c69509c7
    • Marc Zyngier's avatar
      KVM: arm64: Don't write junk to sysregs on reset · 03fdfb26
      Marc Zyngier authored
      
      
      At the moment, the way we reset system registers is mildly insane:
      We write junk to them, call the reset functions, and then check that
      we have something else in them.
      
      The "fun" thing is that this can happen while the guest is running
      (PSCI, for example). If anything in KVM has to evaluate the state
      of a system register while junk is in there, bad thing may happen.
      
      Let's stop doing that. Instead, we track that we have called a
      reset function for that register, and assume that the reset
      function has done something. This requires fixing a couple of
      sysreg refinition in the trap table.
      
      In the end, the very need of this reset check is pretty dubious,
      as it doesn't check everything (a lot of the sysregs leave outside of
      the sys_regs[] array). It may well be axed in the near future.
      
      Tested-by: default avatarZenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      03fdfb26
  5. Aug 05, 2019
    • Marc Zyngier's avatar
      KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block · 5eeaf10e
      Marc Zyngier authored
      Since commit commit 328e5664 ("KVM: arm/arm64: vgic: Defer
      touching GICH_VMCR to vcpu_load/put"), we leave ICH_VMCR_EL2 (or
      its GICv2 equivalent) loaded as long as we can, only syncing it
      back when we're scheduled out.
      
      There is a small snag with that though: kvm_vgic_vcpu_pending_irq(),
      which is indirectly called from kvm_vcpu_check_block(), needs to
      evaluate the guest's view of ICC_PMR_EL1. At the point were we
      call kvm_vcpu_check_block(), the vcpu is still loaded, and whatever
      changes to PMR is not visible in memory until we do a vcpu_put().
      
      Things go really south if the guest does the following:
      
      	mov x0, #0	// or any small value masking interrupts
      	msr ICC_PMR_EL1, x0
      
      	[vcpu preempted, then rescheduled, VMCR sampled]
      
      	mov x0, #ff	// allow all interrupts
      	msr ICC_PMR_EL1, x0
      	wfi		// traps to EL2, so samping of VMCR
      
      	[interrupt arrives just after WFI]
      
      Here, the hypervisor's view of PMR is zero, while the guest has enabled
      its interrupts. kvm_vgic_vcpu_pending_irq() will then say that no
      interrupts are pending (despite an interrupt being received) and we'll
      block for no reason. If the guest doesn't have a periodic interrupt
      firing once it has blocked, it will stay there forever.
      
      To avoid this unfortuante situation, let's resync VMCR from
      kvm_arch_vcpu_blocking(), ensuring that a following kvm_vcpu_check_block()
      will observe the latest value of PMR.
      
      This has been found by booting an arm64 Linux guest with the pseudo NMI
      feature, and thus using interrupt priorities to mask interrupts instead
      of the usual PSTATE masking.
      
      Cc: stable@vger.kernel.org # 4.12
      Fixes: 328e5664
      
       ("KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/put")
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      5eeaf10e
    • Paolo Bonzini's avatar
      x86: kvm: remove useless calls to kvm_para_available · 57b76bdb
      Paolo Bonzini authored
      
      
      Most code in arch/x86/kernel/kvm.c is called through x86_hyper_kvm, and thus only
      runs if KVM has been detected.  There is no need to check again for the CPUID
      base.
      
      Cc: Sergio Lopez <slp@redhat.com>
      Cc: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      57b76bdb
    • Greg KH's avatar
      KVM: no need to check return value of debugfs_create functions · 3e7093d0
      Greg KH authored
      
      
      When calling debugfs functions, there is no need to ever check the
      return value.  The function can work or not, but the code logic should
      never do something different based on this.
      
      Also, when doing this, change kvm_arch_create_vcpu_debugfs() to return
      void instead of an integer, as we should not care at all about if this
      function actually does anything or not.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <x86@kernel.org>
      Cc: <kvm@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      3e7093d0
    • Paolo Bonzini's avatar
      KVM: remove kvm_arch_has_vcpu_debugfs() · 741cbbae
      Paolo Bonzini authored
      
      
      There is no need for this function as all arches have to implement
      kvm_arch_create_vcpu_debugfs() no matter what.  A #define symbol
      let us actually simplify the code.
      
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      741cbbae
    • Wanpeng Li's avatar
      KVM: Fix leak vCPU's VMCS value into other pCPU · 17e433b5
      Wanpeng Li authored
      After commit d73eb57b (KVM: Boost vCPUs that are delivering interrupts), a
      five years old bug is exposed. Running ebizzy benchmark in three 80 vCPUs VMs
      on one 80 pCPUs Skylake server, a lot of rcu_sched stall warning splatting
      in the VMs after stress testing:
      
       INFO: rcu_sched detected stalls on CPUs/tasks: { 4 41 57 62 77} (detected by 15, t=60004 jiffies, g=899, c=898, q=15073)
       Call Trace:
         flush_tlb_mm_range+0x68/0x140
         tlb_flush_mmu.part.75+0x37/0xe0
         tlb_finish_mmu+0x55/0x60
         zap_page_range+0x142/0x190
         SyS_madvise+0x3cd/0x9c0
         system_call_fastpath+0x1c/0x21
      
      swait_active() sustains to be true before finish_swait() is called in
      kvm_vcpu_block(), voluntarily preempted vCPUs are taken into account
      by kvm_vcpu_on_spin() loop greatly increases the probability condition
      kvm_arch_vcpu_runnable(vcpu) is checked and can be true, when APICv
      is enabled the yield-candidate vCPU's VMCS RVI field leaks(by
      vmx_sync_pir_to_irr()) into spinning-on-a-taken-lock vCPU's current
      VMCS.
      
      This patch fixes it by checking conservatively a subset of events.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Marc Zyngier <Marc.Zyngier@arm.com>
      Cc: stable@vger.kernel.org
      Fixes: 98f4a146
      
       (KVM: add kvm_arch_vcpu_runnable() test to kvm_vcpu_on_spin() loop)
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      17e433b5
    • Wanpeng Li's avatar
      KVM: Check preempted_in_kernel for involuntary preemption · 046ddeed
      Wanpeng Li authored
      
      
      preempted_in_kernel is updated in preempt_notifier when involuntary preemption
      ocurrs, it can be stale when the voluntarily preempted vCPUs are taken into
      account by kvm_vcpu_on_spin() loop. This patch lets it just check preempted_in_kernel
      for involuntary preemption.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      046ddeed
    • Wanpeng Li's avatar
      KVM: LAPIC: Don't need to wakeup vCPU twice afer timer fire · a48d06f9
      Wanpeng Li authored
      
      
      kvm_set_pending_timer() will take care to wake up the sleeping vCPU which
      has pending timer, don't need to check this in apic_timer_expired() again.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      a48d06f9
  6. Jul 29, 2019
    • Anders Roxell's avatar
      arm64: KVM: hyp: debug-sr: Mark expected switch fall-through · cdb2d3ee
      Anders Roxell authored
      
      
      When fall-through warnings was enabled by default the following warnings
      was starting to show up:
      
      ../arch/arm64/kvm/hyp/debug-sr.c: In function ‘__debug_save_state’:
      ../arch/arm64/kvm/hyp/debug-sr.c:20:19: warning: this statement may fall
       through [-Wimplicit-fallthrough=]
        case 15: ptr[15] = read_debug(reg, 15);   \
      ../arch/arm64/kvm/hyp/debug-sr.c:113:2: note: in expansion of macro ‘save_debug’
        save_debug(dbg->dbg_bcr, dbgbcr, brps);
        ^~~~~~~~~~
      ../arch/arm64/kvm/hyp/debug-sr.c:21:2: note: here
        case 14: ptr[14] = read_debug(reg, 14);   \
        ^~~~
      ../arch/arm64/kvm/hyp/debug-sr.c:113:2: note: in expansion of macro ‘save_debug’
        save_debug(dbg->dbg_bcr, dbgbcr, brps);
        ^~~~~~~~~~
      ../arch/arm64/kvm/hyp/debug-sr.c:21:19: warning: this statement may fall
       through [-Wimplicit-fallthrough=]
        case 14: ptr[14] = read_debug(reg, 14);   \
      ../arch/arm64/kvm/hyp/debug-sr.c:113:2: note: in expansion of macro ‘save_debug’
        save_debug(dbg->dbg_bcr, dbgbcr, brps);
        ^~~~~~~~~~
      ../arch/arm64/kvm/hyp/debug-sr.c:22:2: note: here
        case 13: ptr[13] = read_debug(reg, 13);   \
        ^~~~
      ../arch/arm64/kvm/hyp/debug-sr.c:113:2: note: in expansion of macro ‘save_debug’
        save_debug(dbg->dbg_bcr, dbgbcr, brps);
        ^~~~~~~~~~
      
      Rework to add a 'Fall through' comment where the compiler warned
      about fall-through, hence silencing the warning.
      
      Fixes: d93512ef0f0e ("Makefile: Globally enable fall-through warning")
      Signed-off-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      [maz: fixed commit message]
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      cdb2d3ee
  7. Jul 26, 2019
    • Zenghui Yu's avatar
      KVM: arm64: Update kvm_arm_exception_class and esr_class_str for new EC · 6701c619
      Zenghui Yu authored
      
      
      We've added two ESR exception classes for new ARM hardware extensions:
      ESR_ELx_EC_PAC and ESR_ELx_EC_SVE, but failed to update the strings
      used in tracing and other debug.
      
      Let's update "kvm_arm_exception_class" for these two EC, which the
      new EC will be visible to user-space via kvm_exit trace events
      Also update to "esr_class_str" for ESR_ELx_EC_PAC, by which we can
      get more readable debug info.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: default avatarJames Morse <james.morse@arm.com>
      Signed-off-by: default avatarZenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      6701c619
    • Anders Roxell's avatar
      KVM: arm: vgic-v3: Mark expected switch fall-through · 1a8248c7
      Anders Roxell authored
      
      
      When fall-through warnings was enabled by default the following warnings
      was starting to show up:
      
      ../virt/kvm/arm/hyp/vgic-v3-sr.c: In function ‘__vgic_v3_save_aprs’:
      ../virt/kvm/arm/hyp/vgic-v3-sr.c:351:24: warning: this statement may fall
       through [-Wimplicit-fallthrough=]
         cpu_if->vgic_ap0r[2] = __vgic_v3_read_ap0rn(2);
         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
      ../virt/kvm/arm/hyp/vgic-v3-sr.c:352:2: note: here
        case 6:
        ^~~~
      ../virt/kvm/arm/hyp/vgic-v3-sr.c:353:24: warning: this statement may fall
       through [-Wimplicit-fallthrough=]
         cpu_if->vgic_ap0r[1] = __vgic_v3_read_ap0rn(1);
         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
      ../virt/kvm/arm/hyp/vgic-v3-sr.c:354:2: note: here
        default:
        ^~~~~~~
      
      Rework so that the compiler doesn't warn about fall-through.
      
      Fixes: d93512ef0f0e ("Makefile: Globally enable fall-through warning")
      Signed-off-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      1a8248c7
    • Anders Roxell's avatar
      arm64: KVM: regmap: Fix unexpected switch fall-through · 3d584a3c
      Anders Roxell authored
      When fall-through warnings was enabled by default, commit d93512ef0f0e
      ("Makefile: Globally enable fall-through warning"), the following
      warnings was starting to show up:
      
      In file included from ../arch/arm64/include/asm/kvm_emulate.h:19,
                       from ../arch/arm64/kvm/regmap.c:13:
      ../arch/arm64/kvm/regmap.c: In function ‘vcpu_write_spsr32’:
      ../arch/arm64/include/asm/kvm_hyp.h:31:3: warning: this statement may fall
       through [-Wimplicit-fallthrough=]
         asm volatile(ALTERNATIVE(__msr_s(r##nvh, "%x0"), \
         ^~~
      ../arch/arm64/include/asm/kvm_hyp.h:46:31: note: in expansion of macro ‘write_sysreg_elx’
       #define write_sysreg_el1(v,r) write_sysreg_elx(v, r, _EL1, _EL12)
                                     ^~~~~~~~~~~~~~~~
      ../arch/arm64/kvm/regmap.c:180:3: note: in expansion of macro ‘write_sysreg_el1’
         write_sysreg_el1(v, SYS_SPSR);
         ^~~~~~~~~~~~~~~~
      ../arch/arm64/kvm/regmap.c:181:2: note: here
        case KVM_SPSR_ABT:
        ^~~~
      In file included from ../arch/arm64/include/asm/cputype.h:132,
                       from ../arch/arm64/include/asm/cache.h:8,
                       from ../include/linux/cache.h:6,
                       from ../include/linux/printk.h:9,
                       from ../include/linux/kernel.h:15,
                       from ../include/asm-generic/bug.h:18,
                       from ../arch/arm64/include/asm/bug.h:26,
                       from ../include/linux/bug.h:5,
                       from ../include/linux/mmdebug.h:5,
                       from ../include/linux/mm.h:9,
                       from ../arch/arm64/kvm/regmap.c:11:
      ../arch/arm64/include/asm/sysreg.h:837:2: warning: this statement may fall
       through [-Wimplicit-fallthrough=]
        asm volatile("msr " __stringify(r) ", %x0"  \
        ^~~
      ../arch/arm64/kvm/regmap.c:182:3: note: in expansion of macro ‘write_sysreg’
         write_sysreg(v, spsr_abt);
         ^~~~~~~~~~~~
      ../arch/arm64/kvm/regmap.c:183:2: note: here
        case KVM_SPSR_UND:
        ^~~~
      
      Rework to add a 'break;' in the swich-case since it didn't have that,
      leading to an interresting set of bugs.
      
      Cc: stable@vger.kernel.org # v4.17+
      Fixes: a8928195
      
       ("KVM: arm64: Prepare to handle deferred save/restore of 32-bit registers")
      Signed-off-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      [maz: reworked commit message, fixed stable range]
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      3d584a3c
  8. Jul 24, 2019
    • Wanpeng Li's avatar
      KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption · 266e85a5
      Wanpeng Li authored
      Commit 11752adb
      
       (locking/pvqspinlock: Implement hybrid PV queued/unfair locks)
      introduces hybrid PV queued/unfair locks
       - queued mode (no starvation)
       - unfair mode (good performance on not heavily contended lock)
      The lock waiter goes into the unfair mode especially in VMs with over-commit
      vCPUs since increaing over-commitment increase the likehood that the queue
      head vCPU may have been preempted and not actively spinning.
      
      However, reschedule queue head vCPU timely to acquire the lock still can get
      better performance than just depending on lock stealing in over-subscribe
      scenario.
      
      Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
      ebizzy -M
                   vanilla     boosting    improved
       1VM          23520        25040         6%
       2VM           8000        13600        70%
       3VM           3100         5400        74%
      
      The lock holder vCPU yields to the queue head vCPU when unlock, to boost queue
      head vCPU which is involuntary preemption or the one which is voluntary halt
      due to fail to acquire the lock after a short spin in the guest.
      
      Cc: Waiman Long <longman@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      266e85a5
    • Christoph Hellwig's avatar
      Documentation: move Documentation/virtual to Documentation/virt · 2f5947df
      Christoph Hellwig authored
      Renaming docs seems to be en vogue at the moment, so fix on of the
      grossly misnamed directories.  We usually never use "virtual" as
      a shortcut for virtualization in the kernel, but always virt,
      as seen in the virt/ top-level directory.  Fix up the documentation
      to match that.
      
      Fixes: ed16648e
      
       ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:")
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      2f5947df
  9. Jul 23, 2019
  10. Jul 22, 2019