  1. Mar 17, 2014
  2. Mar 14, 2014
  3. Mar 13, 2014
    • kvm: x86: ignore ioapic polarity · 100943c5
      Gabriel L. Somlo authored
      
      
      Both QEMU and KVM have already accumulated a significant number of
      optimizations based on the hard-coded assumption that ioapic polarity
      will always use the ActiveHigh convention, where the logical and
      physical states of level-triggered irq lines always match (i.e.,
      active (asserted) == high == 1, inactive == low == 0). QEMU guests
      are expected to follow directions given via ACPI and configure the
      ioapic with polarity 0 (ActiveHigh). However, even when misbehaving
      guests (e.g. OS X <= 10.9) set the ioapic polarity to 1 (ActiveLow),
      QEMU will still use the ActiveHigh signaling convention when
      interfacing with KVM.
      
      This patch modifies KVM to completely ignore ioapic polarity as set by
      the guest OS, enabling misbehaving guests to work alongside those which
      comply with the ActiveHigh polarity specified by QEMU's ACPI tables.
      
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu>
      [Move documentation to KVM_IRQ_LINE, add ia64. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      100943c5
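
      As a rough illustration of the behavior described above (a standalone sketch, not the actual KVM ioapic code; the function name and signature are invented), ignoring polarity means the logical line state is always interpreted under the ActiveHigh convention:

      ```c
      #include <stdbool.h>

      /* Hypothetical sketch: with this patch, the polarity bit the guest
       * programs into the ioapic redirection entry is simply ignored, and
       * a level of 1 always means "asserted" (ActiveHigh). */
      static bool ioapic_line_asserted(int level, int guest_polarity)
      {
          (void)guest_polarity;  /* deliberately ignored */
          return level == 1;
      }
      ```

      A misbehaving guest that programs polarity 1 (ActiveLow) therefore sees exactly the same behavior as a compliant one.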
    • KVM: PPC: Book3S HV: Fix register usage when loading/saving VRSAVE · e724f080
      Paul Mackerras authored
      Commit 595e4f7e ("KVM: PPC: Book3S HV: Use load/store_fp_state
      functions in HV guest entry/exit") changed the register usage in
      kvmppc_save_fp() and kvmppc_load_fp() but omitted changing the
      instructions that load and save VRSAVE.  The result is that the
      VRSAVE value was loaded from a constant address, and saved to a
      location past the end of the vcpu struct, causing host kernel
      memory corruption and various kinds of host kernel crashes.
      
      This fixes the problem by using register r31, which contains the
      vcpu pointer, instead of r3 and r4.
      
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e724f080
    • KVM: PPC: Book3S HV: Remove bogus duplicate code · a5b0ccb0
      Paul Mackerras authored
      Commit 7b490411 ("KVM: PPC: Book3S HV: Add new state for
      transactional memory") incorrectly added some duplicate code to the
      guest exit path because I didn't manage to clean up after a rebase
      correctly.  This removes the extraneous material.  The presence of
      this extraneous code causes host crashes whenever a guest is run.
      
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a5b0ccb0
  4. Mar 11, 2014
    • KVM: svm: Allow the guest to run with dirty debug registers · facb0139
      Paolo Bonzini authored
      
      
      When not running in guest-debug mode (i.e. the guest controls the debug
      registers), having to take an exit for each DR access is a waste of time.
      If the guest gets into a state where each context switch causes DR to be
      saved and restored, this can take away as much as 40% of the execution
      time from the guest.
      
      If the guest is running with vcpu->arch.db == vcpu->arch.eff_db, we
      can let it write freely to the debug registers and reload them on the
      next exit.  We still need to exit on the first access, so that the
      KVM_DEBUGREG_WONT_EXIT flag is set in switch_db_regs; after that, further
      accesses to the debug registers will not cause a vmexit.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      facb0139
    • KVM: svm: set/clear all DR intercepts in one swoop · 5315c716
      Paolo Bonzini authored
      
      
      Unlike other intercepts, debug register intercepts will be modified
      in hot paths if the guest OS is buggy or otherwise gets tricked into
      doing so.
      
      Avoid calling recalc_intercepts 16 times for debug registers.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5315c716
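
      The idea can be sketched in plain C (the names and bit layout below are illustrative, not the real svm.c definitions): rather than toggling each DR read/write intercept bit individually, which would trigger recalc_intercepts once per bit, all 16 bits are set or cleared with a single mask operation:

      ```c
      #include <stdint.h>

      /* Illustrative bit layout: assume bits 0..15 of the intercept word
       * are the DR0..DR7 read and write intercepts (16 bits total). */
      #define DR_INTERCEPT_MASK 0xffffu

      /* One mask write replaces 16 individual bit updates (and 16
       * intercept recalculations). */
      static uint32_t set_all_dr_intercepts(uint32_t intercepts)
      {
          return intercepts | DR_INTERCEPT_MASK;
      }

      static uint32_t clear_all_dr_intercepts(uint32_t intercepts)
      {
          return intercepts & ~DR_INTERCEPT_MASK;
      }
      ```

      Bits outside the DR range are left untouched by either helper.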
    • KVM: nVMX: Allow nested guests to run with dirty debug registers · d16c293e
      Paolo Bonzini authored
      
      
      When preparing the VMCS02, the CPU-based execution controls are computed
      by vmx_exec_control.  Turn off DR access exits there, too, if the
      KVM_DEBUGREG_WONT_EXIT bit is set in switch_db_regs.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d16c293e
    • KVM: vmx: Allow the guest to run with dirty debug registers · 81908bf4
      Paolo Bonzini authored
      
      
      When not running in guest-debug mode (i.e. the guest controls the debug
      registers), having to take an exit for each DR access is a waste of time.
      If the guest gets into a state where each context switch causes DR to be
      saved and restored, this can take away as much as 40% of the execution
      time from the guest.
      
      If the guest is running with vcpu->arch.db == vcpu->arch.eff_db, we
      can let it write freely to the debug registers and reload them on the
      next exit.  We still need to exit on the first access, so that the
      KVM_DEBUGREG_WONT_EXIT flag is set in switch_db_regs; after that, further
      accesses to the debug registers will not cause a vmexit.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      81908bf4
    • KVM: x86: Allow the guest to run with dirty debug registers · c77fb5fe
      Paolo Bonzini authored
      
      
      When not running in guest-debug mode, the guest controls the debug
      registers and having to take an exit for each DR access is a waste
      of time.  If the guest gets into a state where each context switch
      causes DR to be saved and restored, this can take away as much as 40%
      of the execution time from the guest.
      
      After this patch, VMX- and SVM-specific code can set a flag in
      switch_db_regs, telling vcpu_enter_guest that on the next exit the debug
      registers might be dirty and need to be reloaded (syncing will be taken
      care of by a new callback in kvm_x86_ops).  This flag can be set on the
      first access to the debug registers, so that multiple accesses to the
      debug registers only cause one vmexit.
      
      Note that since the guest will be able to read debug registers and
      enable breakpoints in DR7, we need to ensure that they are synchronized
      on entry to the guest, including DR6, which was not synced before.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c77fb5fe
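
      A minimal model of this flow (purely illustrative; only the KVM_DEBUGREG_WONT_EXIT name comes from the patch, the helper and bit value below are invented): the first DR access takes a vmexit and sets the flag, after which further accesses run without exiting until the next vmexit reloads the registers:

      ```c
      #include <stdint.h>

      #define KVM_DEBUGREG_WONT_EXIT (1u << 1)  /* bit value illustrative */

      /* Returns how many of n guest DR accesses would cause a vmexit
       * under the scheme above: only the first, which sets the flag and
       * disables the DR intercepts until the next exit. */
      static int dr_exits_taken(int n_accesses)
      {
          uint32_t switch_db_regs = 0;
          int exits = 0;

          for (int i = 0; i < n_accesses; i++) {
              if (!(switch_db_regs & KVM_DEBUGREG_WONT_EXIT)) {
                  exits++;  /* first access: exit, then stop intercepting */
                  switch_db_regs |= KVM_DEBUGREG_WONT_EXIT;
              }
          }
          return exits;
      }
      ```

      This is how multiple accesses to the debug registers end up costing only a single vmexit.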
    • KVM: x86: change vcpu->arch.switch_db_regs to a bit mask · 360b948d
      Paolo Bonzini authored
      
      
      The next patch will add another bit that we can test with the
      same "if".
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      360b948d
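
      To illustrate (the bit values below are hypothetical, though KVM_DEBUGREG_BP_ENABLED and KVM_DEBUGREG_WONT_EXIT are the flags this series introduces): once the field is a bit mask, a single test covers every reason to reload the debug registers:

      ```c
      #include <stdbool.h>
      #include <stdint.h>

      #define KVM_DEBUGREG_BP_ENABLED (1u << 0)  /* values illustrative */
      #define KVM_DEBUGREG_WONT_EXIT  (1u << 1)

      /* A single "if (vcpu->arch.switch_db_regs)" now covers both bits. */
      static bool must_switch_db_regs(uint32_t switch_db_regs)
      {
          return switch_db_regs != 0;
      }
      ```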
    • KVM: vmx: we do rely on loading DR7 on entry · c845f9c6
      Paolo Bonzini authored
      
      
      Currently, this works even if the bit is not in "min", because the bit is always
      set in MSR_IA32_VMX_ENTRY_CTLS.  Mention it for the sake of documentation, and
      to avoid surprises if we later switch to MSR_IA32_VMX_TRUE_ENTRY_CTLS.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c845f9c6
    • KVM: x86: Remove return code from enable_irq/nmi_window · c9a7953f
      Jan Kiszka authored
      
      
      It's no longer possible to enter enable_irq_window in guest mode when
      L1 intercepts external interrupts and we are entering L2. This is now
      caught in vcpu_enter_guest. So we can remove the check from the VMX
      version of enable_irq_window, and with it the need to return an error
      code from both enable_irq_window and enable_nmi_window.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c9a7953f
    • KVM: nVMX: Do not inject NMI vmexits when L2 has a pending interrupt · 220c5672
      Jan Kiszka authored
      
      
      According to SDM 27.2.3, IDT vectoring information will not be valid on
      vmexits caused by external NMIs. So we have to avoid creating such
      scenarios by delaying EXIT_REASON_EXCEPTION_NMI injection as long as we
      have a pending interrupt because that one would be migrated to L1's IDT
      vectoring info on nested exit.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      220c5672
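
      The rule condenses into a tiny predicate (a sketch with an invented name, not KVM code): an NMI-induced vmexit to L1 is deferred while an interrupt is pending, because the pending interrupt would otherwise land in L1's IDT-vectoring info, which the SDM forbids for NMI-caused exits:

      ```c
      #include <stdbool.h>

      /* Hypothetical helper: only inject EXIT_REASON_EXCEPTION_NMI into
       * L1 when no interrupt is pending that would be migrated into the
       * IDT-vectoring info field of the nested exit. */
      static bool can_inject_nmi_vmexit(bool nmi_pending, bool irq_pending)
      {
          return nmi_pending && !irq_pending;
      }
      ```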
    • KVM: nVMX: Fully emulate preemption timer · f4124500
      Jan Kiszka authored
      
      
      We cannot rely on the hardware-provided preemption timer support because
      we are holding L2 in HLT outside non-root mode. Furthermore, emulating
      the preemption timer will resolve tick rate errata on older Intel CPUs.
      
      The emulation is based on hrtimer which is started on L2 entry, stopped
      on L2 exit and evaluated via the new check_nested_events hook. As we no
      longer rely on hardware features, we can enable both the preemption
      timer support and value saving unconditionally.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f4124500
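
      The arithmetic behind such an emulation can be sketched as follows (a standalone sketch under stated assumptions, not the actual patch code): per the SDM, the VMX preemption timer counts down once every 2^shift TSC cycles, so the hrtimer expiry in nanoseconds follows from the timer value and the TSC frequency:

      ```c
      #include <stdint.h>

      /* Convert a guest-programmed preemption timer value into an hrtimer
       * expiry in nanoseconds. The timer ticks once every 2^rate_shift
       * TSC cycles; tsc_khz is the TSC frequency in kHz (i.e. cycles per
       * millisecond), so ns = cycles * 1,000,000 / tsc_khz. */
      static uint64_t preemption_timer_ns(uint64_t timer_value,
                                          unsigned int rate_shift,
                                          uint64_t tsc_khz)
      {
          uint64_t cycles = timer_value << rate_shift;

          return cycles * 1000000ull / tsc_khz;
      }
      ```

      The hrtimer would then be started with this expiry on L2 entry and cancelled on L2 exit, with the fire event delivered through the new check_nested_events hook.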
    • KVM: nVMX: Rework interception of IRQs and NMIs · b6b8a145
      Jan Kiszka authored
      
      
      Move the check for leaving L2 on pending and intercepted IRQs or NMIs
      from the *_allowed handler into a dedicated callback. Invoke this
      callback at the relevant points before KVM checks if IRQs/NMIs can be
      injected. The callback has the task to switch from L2 to L1 if needed
      and inject the proper vmexit events.
      
      The rework fixes L2 wakeups from HLT and provides the foundation for
      preemption timer emulation.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b6b8a145
  5. Mar 06, 2014
  6. Mar 04, 2014
  7. Mar 03, 2014