  Feb 22, 2020
    • arm64: Ask the compiler to __always_inline functions used by KVM at HYP · e43f1331
      James Morse authored
      
      
      KVM uses some of the static-inline helpers like icache_is_vipt() from
      its HYP code. This assumes the function is inlined so that the code is
      mapped to EL2. The compiler may decide not to inline these, and the
      out-of-line version may not be in the __hyp_text section.
      
      Add the additional __always_ hint to these static-inlines that are used
      by KVM.
      
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Acked-by: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20200220165839.256881-4-james.morse@arm.com
      e43f1331
    • KVM: arm64: Define our own swab32() to avoid a uapi static inline · 8c2d146e
      James Morse authored
      
      
      KVM uses swab32() when mediating GIC MMIO accesses if the GICV is badly
      aligned, and the host and guest differ in endianness.
      
      arm64 doesn't provide a __arch_swab32(), so __fswab32() is always backed
      by the macro implementation that the compiler reduces to a single
      instruction. But the static inline causes problems for KVM: if the
      compiler chooses not to inline this function, it may not be located in
      the __hyp_text section where __vgic_v2_perform_cpuif_access() needs it.
      
      Create our own __kvm_swab32() macro that calls ___constant_swab32()
      directly. This way we know it will always be inlined.
      
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20200220165839.256881-3-james.morse@arm.com
      8c2d146e
    • KVM: arm64: Ask the compiler to __always_inline functions used at HYP · 5c37f1ae
      James Morse authored
      
      
      On non-VHE CPUs, KVM's __hyp_text contains code run at EL2 while the rest
      of the kernel runs at EL1. This code lives in its own section with start
      and end markers so we can map it to EL2.
      
      The compiler may decide not to inline static-inline functions from the
      header file. It may also decide not to put these out-of-line functions
      in the same section, meaning they aren't mapped when called at EL2.
      
      Clang-9 does exactly this with __kern_hyp_va() and a few others when
      x18 is reserved for the shadow call stack. Add the additional __always_
      hint to all the static-inlines that are called from a hyp file.
      
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20200220165839.256881-2-james.morse@arm.com
      
      ----
      kvm_get_hyp_vector() pulls in all the regular per-cpu accessors
      and this_cpu_has_cap(); fortunately it's only called for VHE.
      5c37f1ae
    • KVM: SVM: Fix potential memory leak in svm_cpu_init() · d80b64ff
      Miaohe Lin authored
      
      
      When kmalloc() for sd->sev_vmcbs fails, we forget to free the page
      held by sd->save_area. Also get rid of the variable r, as '-ENOMEM'
      is actually the only possible outcome here.
      
      Reviewed-by: Liran Alon <liran.alon@oracle.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d80b64ff
    • KVM: apic: avoid calculating pending eoi from an uninitialized val · 23520b2d
      Miaohe Lin authored
      
      
      When pv_eoi_get_user() fails, 'val' may remain uninitialized and the return
      value of pv_eoi_get_pending() becomes random. Fix the issue by initializing
      the variable.
      
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      23520b2d
    • KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled · a4443267
      Vitaly Kuznetsov authored
      
      
      When apicv is disabled on a vCPU (e.g. by enabling KVM_CAP_HYPERV_SYNIC*),
      nothing happens to VMX MSRs on the already existing vCPUs, however, all new
      ones are created with PIN_BASED_POSTED_INTR filtered out. This is very
      confusing and results in the following picture inside the guest:
      
      $ rdmsr -ax 0x48d
      ff00000016
      7f00000016
      7f00000016
      7f00000016
      
      This is observed with QEMU and 4-vCPU guest: QEMU creates vCPU0, does
      KVM_CAP_HYPERV_SYNIC2 and then creates the remaining three.
      
      The L1 hypervisor may only check vCPU0's controls to find out which
      features are available, and it will be very confused later. Switch to
      setting the PIN_BASED_POSTED_INTR control based on the global
      'enable_apicv' setting.
      
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a4443267
    • KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1 · 91a5f413
      Vitaly Kuznetsov authored
      
      
      Even when APICv is disabled for L1, it can still be (and, in fact, is)
      available for L2; this means we always need to call
      vmx_deliver_nested_posted_interrupt() when attempting an interrupt
      delivery.
      
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      91a5f413
    • kvm: x86: svm: Fix NULL pointer dereference when AVIC not enabled · 93fd9666
      Suravee Suthikulpanit authored
      Launching a VM with AVIC disabled together with a pass-through device
      results in a NULL pointer dereference with the following call trace.
      
          RIP: 0010:svm_refresh_apicv_exec_ctrl+0x17e/0x1a0 [kvm_amd]
      
          Call Trace:
           kvm_vcpu_update_apicv+0x44/0x60 [kvm]
           kvm_arch_vcpu_ioctl_run+0x3f4/0x1c80 [kvm]
           kvm_vcpu_ioctl+0x3d8/0x650 [kvm]
           do_vfs_ioctl+0xaa/0x660
           ? tomoyo_file_ioctl+0x19/0x20
           ksys_ioctl+0x67/0x90
           __x64_sys_ioctl+0x1a/0x20
           do_syscall_64+0x57/0x190
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Investigation shows that this is due to the uninitialized usage of
      struct vcpu_svm.ir_list in svm_set_pi_irte_mode(), which is
      called from svm_refresh_apicv_exec_ctrl().
      
      The ir_list is initialized only if AVIC is enabled, so fix this by
      adding a check for whether AVIC is enabled in svm_refresh_apicv_exec_ctrl().
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206579
      Fixes: 8937d762 ("kvm: x86: svm: Add support to (de)activate posted interrupts.")
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Tested-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      93fd9666
    • KVM: VMX: Add VMX_FEATURE_USR_WAIT_PAUSE · 624e18f9
      Xiaoyao Li authored
      Commit 15934878 ("x86/vmx: Introduce VMX_FEATURES_*") missed
      bit 26 (enable user wait and pause) of Secondary Processor-based
      VM-Execution Controls.
      
      Add VMX_FEATURE_USR_WAIT_PAUSE flag so that it shows up in /proc/cpuinfo,
      and use it to define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE to make them
      uniform.
      
      Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      624e18f9
    • KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow · c9dfd3fb
      Wanpeng Li authored
      
      
      For the duration of mapping the eVMCS, KVM dereferences ->memslots
      without holding ->srcu or ->slots_lock when accessing the HV assist
      page. Fix it by moving nested_sync_vmcs12_to_shadow() to
      prepare_guest_switch(), where the SRCU lock is already taken.
      
      It can be reproduced by running kvm's evmcs_test selftest.
      
        =============================
        WARNING: suspicious RCU usage
        5.6.0-rc1+ #53 Tainted: G        W IOE
        -----------------------------
        ./include/linux/kvm_host.h:623 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
         rcu_scheduler_active = 2, debug_locks = 1
        1 lock held by evmcs_test/8507:
         #0: ffff9ddd156d00d0 (&vcpu->mutex){+.+.}, at:
      kvm_vcpu_ioctl+0x85/0x680 [kvm]
      
        stack backtrace:
        CPU: 6 PID: 8507 Comm: evmcs_test Tainted: G        W IOE     5.6.0-rc1+ #53
        Hardware name: Dell Inc. OptiPlex 7040/0JCTF8, BIOS 1.4.9 09/12/2016
        Call Trace:
         dump_stack+0x68/0x9b
         kvm_read_guest_cached+0x11d/0x150 [kvm]
         kvm_hv_get_assist_page+0x33/0x40 [kvm]
         nested_enlightened_vmentry+0x2c/0x60 [kvm_intel]
         nested_vmx_handle_enlightened_vmptrld.part.52+0x32/0x1c0 [kvm_intel]
         nested_sync_vmcs12_to_shadow+0x439/0x680 [kvm_intel]
         vmx_vcpu_run+0x67a/0xe60 [kvm_intel]
         vcpu_enter_guest+0x35e/0x1bc0 [kvm]
         kvm_arch_vcpu_ioctl_run+0x40b/0x670 [kvm]
         kvm_vcpu_ioctl+0x370/0x680 [kvm]
         ksys_ioctl+0x235/0x850
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x77/0x780
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c9dfd3fb
    • KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI · 7455a832
      Miaohe Lin authored
      As commit 13db7734 ("KVM: x86: don't notify userspace IOAPIC on edge
      EOI") said, edge-triggered interrupts don't set a bit in the TMR,
      which means the IOAPIC isn't notified on EOI, and the variable 'level'
      indicates a level-triggered interrupt.
      But commit 3159d36a ("KVM: x86: use generic function for MSI parsing")
      replaced the variable 'level' with irq.level by mistake. Fix it by
      changing irq.level to irq.trig_mode.
      
      Cc: stable@vger.kernel.org
      Fixes: 3159d36a ("KVM: x86: use generic function for MSI parsing")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7455a832