  1. May 22, 2014
    • KVM: x86: get CPL from SS.DPL · ae9fedc7
      Paolo Bonzini authored
      
      
      CS.RPL is not equal to the CPL in the few instructions between
      setting CR0.PE and reloading CS.  And CS.DPL is also not equal
      to the CPL for conforming code segments.
      
      However, SS.DPL *is* always equal to the CPL except for the weird
      case of SYSRET on AMD processors, which sets SS.DPL=SS.RPL from the
      value in the STAR MSR, but forces CPL=3 (Intel instead forces
      SS.DPL=SS.RPL=CPL=3).
      
      So this patch:
      
      - modifies SVM to update the CPL from SS.DPL rather than CS.RPL;
      the above case with SYSRET is not broken further, and the way
      to fix it would be to pass the CPL to userspace and back
      
      - modifies VMX to always return the CPL from SS.DPL (except
      forcing it to 0 if we are emulating real mode via vm86 mode;
      in vm86 mode all DPLs have to be 3, but real mode does allow
      privileged instructions).  It also removes the CPL cache,
      which becomes a duplicate of the SS access rights cache.
      
      This fixes doing KVM_IOCTL_SET_SREGS exactly after setting
      CR0.PE=1 but before CS has been reloaded.
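
      A minimal sketch of the resulting CPL computation, using simplified,
      hypothetical types rather than the real vmx.c/svm.c structures:

          struct seg_cache {
              unsigned int ss_ar_bytes;    /* cached SS access rights; DPL is bits 5-6 */
              bool emulating_real_mode;    /* real mode emulated through vm86 */
          };

          static unsigned int sketch_get_cpl(const struct seg_cache *seg)
          {
              if (seg->emulating_real_mode)
                  return 0;    /* vm86 forces DPL=3, but real mode allows privileged insns */
              return (seg->ss_ar_bytes >> 5) & 3;    /* CPL == SS.DPL */
          }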
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: check CS.DPL against RPL during task switch · 5045b468
      Paolo Bonzini authored
      
      
      Table 7-1 of the SDM mentions a check that the code segment's
      DPL must match the selector's RPL.  KVM did not perform this check;
      fix it.
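
      A hedged sketch of the kind of check being added, with illustrative
      names (the real check lives in the emulator's segment-loading path):

          static int sketch_check_task_switch_cs(unsigned int selector, unsigned int dpl)
          {
              unsigned int rpl = selector & 3;

              if (dpl != rpl)
                  return -1;    /* real code raises #GP with the selector as error code */
              return 0;
          }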
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: drop set_rflags callback · fb5e336b
      Paolo Bonzini authored
      
      
      Not needed anymore now that the CPL is computed directly
      during task switch.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: use new CS.RPL as CPL during task switch · 2356aaeb
      Paolo Bonzini authored
      
      
      During task switch, all of CS.DPL, CS.RPL, SS.DPL must match (in addition
      to all the other requirements) and will be the new CPL.  So far this
      worked by carefully setting the CS selector and flag before doing the
      task switch; setting CS.selector will already change the CPL.
      
      However, this will not work once we get the CPL from SS.DPL, because
      then you will have to set the full segment descriptor cache to change
      the CPL.  ctxt->ops->cpl(ctxt) will then return the old CPL during the
      task switch, and the check that SS.DPL == CPL will fail.
      
      Temporarily assume that the CPL comes from CS.RPL during task switch
      to a protected-mode task.  This is the same approach used in QEMU's
      emulation code, which (until version 2.0) manually tracks the CPL.
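
      A hedged sketch of the temporary rule (illustrative helper, not the
      actual emulate.c code):

          static unsigned int sketch_task_switch_cpl(unsigned int new_cs_selector,
                                                     unsigned int current_cpl,
                                                     bool protected_mode)
          {
              /* SS.DPL still describes the old task at this point, so use the
               * incoming CS.RPL as the new CPL while in protected mode. */
              if (protected_mode)
                  return new_cs_selector & 3;
              return current_cpl;    /* otherwise keep what ctxt->ops->cpl() reports */
          }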
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. May 17, 2014
    • Merge tag 'kvm-s390-20140516' of... · afa538f0
      Paolo Bonzini authored
      Merge tag 'kvm-s390-20140516' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next
      
      1. Correct locking for lazy storage key handling
         A test loop with multiple CPUs triggered a race in the lazy storage
         key handling as introduced by commit 934bc131
         (KVM: s390: Allow skeys to be enabled for the current process). This
         race should not happen with Linux guests, but let's fix it anyway.
         Patch touches !/kvm/ code, but is from the s390 maintainer.
      
      2. Better handling of broken guests
         If we detect a program check loop we stop the guest instead of
         wasting CPU cycles.
      
      3. Better handling of MVPG emulation
         The move page handling is improved to be architecturally correct.

      4. Trace point rework
         Let's rework the kvm trace points to have a common header file (for
         later perf usage) and provide a table-based instruction decoder.

      5. Interpretive execution of SIGP external call
         Let the hardware handle most cases of SIGP external call (IPI) and
         wire up the fixup code for the corner cases.

      6. Initial preparations for the IBC facility
         Prepare the code to handle instruction blocking.
  3. May 16, 2014
  4. May 13, 2014
  5. May 08, 2014
    • kvm: x86: emulate monitor and mwait instructions as nop · 87c00572
      Gabriel L. Somlo authored
      
      
      Treat monitor and mwait instructions as nop, which is architecturally
      correct (but inefficient) behavior. We do this to prevent misbehaving
      guests (e.g. OS X <= 10.7) from crashing after they fail to check for
      monitor/mwait availability via cpuid.
      
      Since mwait-based idle loops relying on these nop-emulated instructions
      would keep the host CPU pegged at 100%, do NOT advertise their presence
      via cpuid, to prevent compliant guests from using them inadvertently.
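
      A hedged sketch of what "emulate as nop" amounts to in a VM-exit
      handler, using simplified, hypothetical types (not the actual
      vmx.c/svm.c handlers):

          struct vcpu_state {                    /* simplified stand-in */
              unsigned long rip;
              unsigned int exit_instruction_len;
          };

          static int sketch_handle_mwait_or_monitor(struct vcpu_state *vcpu)
          {
              vcpu->rip += vcpu->exit_instruction_len;    /* just skip the instruction */
              return 1;                                   /* keep running the guest */
          }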
      
      Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm/x86: implement hv EOI assist · b63cf42f
      Michael S. Tsirkin authored
      
      
      It seems that it's easy to implement the EOI assist
      on top of the PV EOI feature: simply convert the
      page address to the format expected by PV EOI.
      
      Notes:
      -"No EOI required" is set only if interrupt injected
       is edge triggered; this is true because level interrupts are going
       through IOAPIC which disables PV EOI.
       In any case, if guest triggers EOI the bit will get cleared on exit.
      -For migration, set of HV_X64_MSR_APIC_ASSIST_PAGE sets
       KVM_PV_EOI_EN internally, so restoring HV_X64_MSR_APIC_ASSIST_PAGE
       seems sufficient
       In any case, bit is cleared on exit so worst case it's never re-enabled
      -no handling of PV EOI data is performed at HV_X64_MSR_EOI write;
       HV_X64_MSR_EOI is a separate optimization - it's an X2APIC
       replacement that lets you do EOI with an MSR and not IO.
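
      A hedged sketch of just the conversion step, with illustrative
      constants and helper names (not the real kvm_hv_set_msr code):

          #define SKETCH_ASSIST_ENABLE   1ULL    /* enable bit in the assist-page MSR */
          #define SKETCH_PAGE_SHIFT      12      /* page frame number starts at bit 12 */
          #define SKETCH_PV_EOI_ENABLED  1ULL    /* enable flag expected by PV EOI */

          static unsigned long long sketch_assist_to_pv_eoi(unsigned long long msr_val)
          {
              unsigned long long gfn;

              if (!(msr_val & SKETCH_ASSIST_ENABLE))
                  return 0;                        /* disabled: PV EOI stays off */
              gfn = msr_val >> SKETCH_PAGE_SHIFT;
              return (gfn << SKETCH_PAGE_SHIFT) | SKETCH_PV_EOI_ENABLED;
          }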
      
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. May 07, 2014
  7. May 06, 2014
  8. May 05, 2014
    • kvm/irqchip: Speed up KVM_SET_GSI_ROUTING · 719d93cd
      Christian Borntraeger authored
      
      
      When starting lots of dataplane devices the bootup takes very long on
      Christian's s390 with irqfd patches. With larger setups he is even
      able to trigger some timeouts in some components.  Turns out that the
      KVM_SET_GSI_ROUTING ioctl takes very long (strace claims up to 0.1 sec)
      when having multiple CPUs.  This is caused by the  synchronize_rcu and
      the HZ=100 of s390.  By changing the code to use a private srcu we can
      speed things up.  This patch reduces the boot time till mounting root
      from 8 to 2 seconds on my s390 guest with 100 disks.
      
      Uses of hlist_for_each_entry_rcu, hlist_add_head_rcu, hlist_del_init_rcu
      are fine because they do not have lockdep checks (hlist_for_each_entry_rcu
      uses rcu_dereference_raw rather than rcu_dereference, and write-sides
      do not do rcu lockdep at all).
      
      Note that we're hardly relying on the "sleepable" part of srcu.  We just
      want SRCU's faster detection of grace periods.
      
      Testing was done by Andrew Theurer using netperf tests STREAM, MAERTS
      and RR.  The difference between results "before" and "after" the patch
      has mean -0.2% and standard deviation 0.6%.  Using a paired t-test on the
      data points says that there is a 2.5% probability that the patch is the
      cause of the performance difference (rather than a random fluctuation).
      
      (Restricting the t-test to RR, which is the most likely to be affected,
      changes the numbers to respectively -0.3% mean, 0.7% stdev, and 8%
      probability that the numbers actually say something about the patch.
      The probability increases mostly because there are fewer data points).
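
      A hedged sketch of the pattern (not the actual kvm/irqchip.c code):
      give the routing table its own SRCU domain so that an update only has
      to wait for that domain's readers.

          #include <linux/slab.h>
          #include <linux/srcu.h>

          struct irq_table { int nr_entries; };    /* stand-in for the routing table */

          struct routing_holder {
              struct srcu_struct srcu;             /* private SRCU domain */
              struct irq_table __rcu *table;
          };

          static int sketch_read_entries(struct routing_holder *h)
          {
              int idx, n;

              idx = srcu_read_lock(&h->srcu);
              n = srcu_dereference(h->table, &h->srcu)->nr_entries;
              srcu_read_unlock(&h->srcu, idx);
              return n;
          }

          static void sketch_publish_table(struct routing_holder *h,
                                           struct irq_table *new_table)
          {
              struct irq_table *old = rcu_dereference_protected(h->table, 1);

              rcu_assign_pointer(h->table, new_table);
              synchronize_srcu_expedited(&h->srcu);    /* far cheaper than synchronize_rcu() here */
              kfree(old);
          }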
      
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # s390
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  9. Apr 30, 2014
    • Merge tag 'kvm-s390-20140429' of... · 57b5981c
      Paolo Bonzini authored
      Merge tag 'kvm-s390-20140429' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next
      
      1. Guest handling fixes
      The handling of MVPG, PFMF and Test Block is fixed to better follow
      the architecture. None of these fixes is critical for any current
      Linux guests, but let's play safe.
      
      2. Optimization for single-CPU guests
      We can enable the IBS facility if only one VCPU is running (!STOPPED
      state). We also enable this optimization for guests with more than one
      VCPU as soon as all but one VCPU are in the stopped state. This will
      help guests that have tools like cpuplugd (from s390-utils) that do
      dynamic offline/online of CPUs.
      
      3. NOTES
      There is one non-s390 change in include/linux/kvm_host.h that
      introduces 2 defines for VCPU requests:
      #define KVM_REQ_ENABLE_IBS        23
      #define KVM_REQ_DISABLE_IBS       24
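
      A hedged sketch of how such requests are typically raised and consumed
      (illustrative only, not the actual s390 patches):

          #include <linux/kvm_host.h>

          static void sketch_toggle_ibs(struct kvm_vcpu *vcpu, bool enable)
          {
              /* requester side: flag the request for the target vcpu */
              kvm_make_request(enable ? KVM_REQ_ENABLE_IBS : KVM_REQ_DISABLE_IBS, vcpu);
              /*
               * consumer side (in the vcpu run loop):
               *   if (kvm_check_request(KVM_REQ_ENABLE_IBS, vcpu))
               *           ...enable interpretation blocking in the control block...
               */
          }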
  10. Apr 29, 2014