  1. Jun 04, 2015
  2. Jun 03, 2015
  3. Jun 02, 2015
  4. May 29, 2015
    • KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR · b7e60c5a
      Marcelo Tosatti authored
      
      
      Initialize the kvmclock base when the kvmclock system MSR is written,
      so that the guest sees kvmclock counting from zero.
      
      This matches bare-metal behaviour when kvmclock in the guest
      sets the sched clock stable.
      
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      [Remove unnecessary comment. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86: kvmclock: set scheduler clock stable · 0ad83caa
      Luiz Capitulino authored
      
      
      If you try to enable NOHZ_FULL on a guest today, you'll get
      the following error when the guest tries to deactivate the
      scheduler tick:
      
       WARNING: CPU: 3 PID: 2182 at kernel/time/tick-sched.c:192 can_stop_full_tick+0xb9/0x290()
       NO_HZ FULL will not work with unstable sched clock
       CPU: 3 PID: 2182 Comm: kworker/3:1 Not tainted 4.0.0-10545-gb9bb6fb #204
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       Workqueue: events flush_to_ldisc
        ffffffff8162a0c7 ffff88011f583e88 ffffffff814e6ba0 0000000000000002
        ffff88011f583ed8 ffff88011f583ec8 ffffffff8104d095 ffff88011f583eb8
        0000000000000000 0000000000000003 0000000000000001 0000000000000001
       Call Trace:
        <IRQ>  [<ffffffff814e6ba0>] dump_stack+0x4f/0x7b
        [<ffffffff8104d095>] warn_slowpath_common+0x85/0xc0
        [<ffffffff8104d146>] warn_slowpath_fmt+0x46/0x50
        [<ffffffff810bd2a9>] can_stop_full_tick+0xb9/0x290
        [<ffffffff810bd9ed>] tick_nohz_irq_exit+0x8d/0xb0
        [<ffffffff810511c5>] irq_exit+0xc5/0x130
        [<ffffffff814f180a>] smp_apic_timer_interrupt+0x4a/0x60
        [<ffffffff814eff5e>] apic_timer_interrupt+0x6e/0x80
        <EOI>  [<ffffffff814ee5d1>] ? _raw_spin_unlock_irqrestore+0x31/0x60
        [<ffffffff8108bbc8>] __wake_up+0x48/0x60
        [<ffffffff8134836c>] n_tty_receive_buf_common+0x49c/0xba0
        [<ffffffff8134a6bf>] ? tty_ldisc_ref+0x1f/0x70
        [<ffffffff81348a84>] n_tty_receive_buf2+0x14/0x20
        [<ffffffff8134b390>] flush_to_ldisc+0xe0/0x120
        [<ffffffff81064d05>] process_one_work+0x1d5/0x540
        [<ffffffff81064c81>] ? process_one_work+0x151/0x540
        [<ffffffff81065191>] worker_thread+0x121/0x470
        [<ffffffff81065070>] ? process_one_work+0x540/0x540
        [<ffffffff8106b4df>] kthread+0xef/0x110
        [<ffffffff8106b3f0>] ? __kthread_parkme+0xa0/0xa0
        [<ffffffff814ef4f2>] ret_from_fork+0x42/0x70
        [<ffffffff8106b3f0>] ? __kthread_parkme+0xa0/0xa0
       ---[ end trace 06e3507544a38866 ]---
      
      However, it turns out that kvmclock does provide a stable
      sched_clock callback. So let the scheduler know this, which
      in turn makes NOHZ_FULL work in the guest.
      
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86: kvmclock: add flag to indicate pvclock counts from zero · 61191725
      Marcelo Tosatti authored
      
      
      Setting the sched clock stable for kvmclock causes the printk timestamps
      to not start from zero, which differs from bare metal and
      can possibly break userspace. Add a flag to indicate that the
      hypervisor sets the clock base to zero when kvmclock is initialized.
      
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  5. May 28, 2015
  6. May 26, 2015
  7. May 20, 2015
    • Merge branch 'kvm-master' into kvm-next · a9b4fb7e
      Paolo Bonzini authored
      
      
      Grab MPX bugfix, and fix conflicts against Rik's adaptive FPU
      deactivation patch.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm/fpu: Enable eager restore kvm FPU for MPX · c447e76b
      Liang Li authored
      
      
      The MPX feature requires eager KVM FPU restore support. We have verified
      that MPX cannot work correctly with the current lazy KVM FPU restore
      mechanism. Eager KVM FPU restore should be enabled if the MPX feature is
      exposed to the VM.
      
      Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
      Signed-off-by: Liang Li <liang.z.li@intel.com>
      [Also activate the FPU on AMD processors. - Paolo]
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Revert "KVM: x86: drop fpu_activate hook" · 0fdd74f7
      Paolo Bonzini authored
      This reverts commit 4473b570.  We'll use the hook again.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: fix crash in kvm_vcpu_reload_apic_access_page · e8fd5e9e
      Andrea Arcangeli authored
      
      
      memslot->userfault_addr is set by the kernel with a mmap executed
      from the kernel but the userland can still munmap it and lead to the
      below oops after memslot->userfault_addr points to a host virtual
      address that has no vma or mapping.
      
      [  327.538306] BUG: unable to handle kernel paging request at fffffffffffffffe
      [  327.538407] IP: [<ffffffff811a7b55>] put_page+0x5/0x50
      [  327.538474] PGD 1a01067 PUD 1a03067 PMD 0
      [  327.538529] Oops: 0000 [#1] SMP
      [  327.538574] Modules linked in: macvtap macvlan xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT iptable_filter ip_tables tun bridge stp llc rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ipmi_devintf iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp dcdbas intel_rapl kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr sb_edac edac_core ipmi_si ipmi_msghandler acpi_pad wmi acpi_power_meter lpc_ich mfd_core mei_me
      [  327.539488]  mei shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc mlx4_ib ib_sa ib_mad ib_core mlx4_en vxlan ib_addr ip_tunnel xfs libcrc32c sd_mod crc_t10dif crct10dif_common crc32c_intel mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm drm ahci i2c_core libahci mlx4_core libata tg3 ptp pps_core megaraid_sas ntb dm_mirror dm_region_hash dm_log dm_mod
      [  327.539956] CPU: 3 PID: 3161 Comm: qemu-kvm Not tainted 3.10.0-240.el7.userfault19.4ca4011.x86_64.debug #1
      [  327.540045] Hardware name: Dell Inc. PowerEdge R420/0CN7CM, BIOS 2.1.2 01/20/2014
      [  327.540115] task: ffff8803280ccf00 ti: ffff880317c58000 task.ti: ffff880317c58000
      [  327.540184] RIP: 0010:[<ffffffff811a7b55>]  [<ffffffff811a7b55>] put_page+0x5/0x50
      [  327.540261] RSP: 0018:ffff880317c5bcf8  EFLAGS: 00010246
      [  327.540313] RAX: 00057ffffffff000 RBX: ffff880616a20000 RCX: 0000000000000000
      [  327.540379] RDX: 0000000000002014 RSI: 00057ffffffff000 RDI: fffffffffffffffe
      [  327.540445] RBP: ffff880317c5bd10 R08: 0000000000000103 R09: 0000000000000000
      [  327.540511] R10: 0000000000000000 R11: 0000000000000000 R12: fffffffffffffffe
      [  327.540576] R13: 0000000000000000 R14: ffff880317c5bd70 R15: ffff880317c5bd50
      [  327.540643] FS:  00007fd230b7f700(0000) GS:ffff880630800000(0000) knlGS:0000000000000000
      [  327.540717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  327.540771] CR2: fffffffffffffffe CR3: 000000062a2c3000 CR4: 00000000000427e0
      [  327.540837] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  327.540904] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  327.540974] Stack:
      [  327.541008]  ffffffffa05d6d0c ffff880616a20000 0000000000000000 ffff880317c5bdc0
      [  327.541093]  ffffffffa05ddaa2 0000000000000000 00000000002191bf 00000042f3feab2d
      [  327.541177]  00000042f3feab2d 0000000000000002 0000000000000001 0321000000000000
      [  327.541261] Call Trace:
      [  327.541321]  [<ffffffffa05d6d0c>] ? kvm_vcpu_reload_apic_access_page+0x6c/0x80 [kvm]
      [  327.543615]  [<ffffffffa05ddaa2>] vcpu_enter_guest+0x3f2/0x10f0 [kvm]
      [  327.545918]  [<ffffffffa05e2f10>] kvm_arch_vcpu_ioctl_run+0x2b0/0x5a0 [kvm]
      [  327.548211]  [<ffffffffa05e2d02>] ? kvm_arch_vcpu_ioctl_run+0xa2/0x5a0 [kvm]
      [  327.550500]  [<ffffffffa05ca845>] kvm_vcpu_ioctl+0x2b5/0x680 [kvm]
      [  327.552768]  [<ffffffff810b8d12>] ? creds_are_invalid.part.1+0x12/0x50
      [  327.555069]  [<ffffffff810b8d71>] ? creds_are_invalid+0x21/0x30
      [  327.557373]  [<ffffffff812d6066>] ? inode_has_perm.isra.49.constprop.65+0x26/0x80
      [  327.559663]  [<ffffffff8122d985>] do_vfs_ioctl+0x305/0x530
      [  327.561917]  [<ffffffff8122dc51>] SyS_ioctl+0xa1/0xc0
      [  327.564185]  [<ffffffff816de829>] system_call_fastpath+0x16/0x1b
      [  327.566480] Code: 0b 31 f6 4c 89 e7 e8 4b 7f ff ff 0f 0b e8 24 fd ff ff e9 a9 fd ff ff 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 <48> f7 07 00 c0 00 00 55 48 89 e5 75 2a 8b 47 1c 85 c0 74 1e f0
      
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: x86: Make functions that have no external callers static · ed3cf152
      Nicholas Krause authored
      
      
      Make the functions kvm_guest_cpu_init and kvm_init_debugfs
      static, since they have no callers outside the file kvm.c
      in which they are defined.
      
      Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: export __gfn_to_pfn_memslot, drop gfn_to_pfn_async · 3520469d
      Paolo Bonzini authored
      
      
      gfn_to_pfn_async is used in just one place, and because of x86-specific
      treatment that place will need to look at the memory slot.  Hence inline
      it into try_async_pf and export __gfn_to_pfn_memslot.
      
      The patch also switches the subsequent call to gfn_to_pfn_prot to use
      __gfn_to_pfn_memslot.  This is a small optimization.  Finally, remove
      the now-unused async argument of __gfn_to_pfn.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: mips: use id_to_memslot correctly · 69a12200
      Paolo Bonzini authored
      
      
      The argument to KVM_GET_DIRTY_LOG is a memslot id; it may not match the
      position in the memslots array, which is sorted by gfn.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
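
      The distinction drawn in the "use id_to_memslot correctly" message
      (a memslot id is not the same as a position in the gfn-sorted array)
      can be sketched as follows. This is a simplified illustration, not the
      kernel's code: struct slot, struct slots and slot_by_id are
      hypothetical stand-ins, and the lookup here is a plain linear scan.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for KVM's memslot structures. */
struct slot {
    int id;            /* id chosen by userspace */
    uint64_t base_gfn; /* first guest frame number covered by the slot */
};

struct slots {
    struct slot arr[4]; /* ordered by base_gfn, NOT by id */
    size_t used;        /* number of valid entries in arr */
};

/*
 * Look a slot up by its id. Indexing arr[id] directly would be wrong,
 * because the array position depends on base_gfn, not on the id.
 * Returns NULL when no slot carries the requested id.
 */
static const struct slot *slot_by_id(const struct slots *s, int id)
{
    for (size_t i = 0; i < s->used; i++) {
        if (s->arr[i].id == id)
            return &s->arr[i];
    }
    return NULL;
}
```

      With slots whose ids are {2, 0, 1} in array order, arr[0] is the slot
      with id 2, so using the id as an index would return the wrong slot.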
    • KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed · d81135a5
      Xiao Guangrong authored
      
      
      CR0.CD and CR0.NW are not used by the shadow page table, so there is
      no need to adjust the mmu if these two bits are changed.
      
      Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
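
      The idea of the CR0.CD/CR0.NW change can be sketched as a bit-mask
      test on the changed CR0 bits. The bit positions below are the
      architectural ones; the choice of PG and WP as the bits that still
      trigger a reset is an assumption made for illustration, and
      cr0_change_needs_mmu_reset is a hypothetical helper, not a kernel
      function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR0 bit positions. */
#define CR0_WP (1ULL << 16) /* write protect */
#define CR0_NW (1ULL << 29) /* not write-through */
#define CR0_CD (1ULL << 30) /* cache disable */
#define CR0_PG (1ULL << 31) /* paging */

/*
 * Return true if the transition old_cr0 -> new_cr0 touches a bit that the
 * shadow MMU depends on. CD and NW are deliberately absent from the mask.
 * Assumption: PG and WP are the bits that matter to the MMU in this sketch.
 */
static bool cr0_change_needs_mmu_reset(uint64_t old_cr0, uint64_t new_cr0)
{
    const uint64_t mmu_bits = CR0_PG | CR0_WP;

    return ((old_cr0 ^ new_cr0) & mmu_bits) != 0;
}
```

      Toggling only CD and/or NW leaves the masked difference at zero, so
      no reset is requested.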
    • KVM: MMU: fix MTRR update · efdfe536
      Xiao Guangrong authored
      
      
      Currently, whenever guest MTRR registers are changed,
      kvm_mmu_reset_context is called to switch to the new root shadow page
      table. However, this is useless since:
      1) the cache type is not recorded in the shadow page's attributes, so
         the original root shadow page will be reused
      
      2) the cache type is set on the last spte, which means we should sync
         the last sptes when MTRR is changed
      
      This patch fixes the issue by dropping all the sptes in the gfn range
      that is being updated by MTRR.
      
      Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
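
      "Dropping all the sptes in the gfn range" can be pictured with a toy
      model. The flat sptes table, NR_GFNS and zap_gfn_range below are
      hypothetical and much simpler than KVM's shadow page tables; the
      sketch only shows the conversion from a guest-physical address range
      to a gfn range and the clearing of every entry inside it.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* 4 KiB pages */
#define NR_GFNS    16 /* size of the toy table (assumption) */

/* Hypothetical flat table: one last-level spte per gfn, 0 = not present. */
static uint64_t sptes[NR_GFNS];

/*
 * Clear every present spte whose gfn lies in [start_gpa, end_gpa).
 * The end address is rounded up so that a partially covered page is
 * included. Returns the number of entries that were dropped.
 */
static size_t zap_gfn_range(uint64_t start_gpa, uint64_t end_gpa)
{
    uint64_t gfn_start = start_gpa >> PAGE_SHIFT;
    uint64_t gfn_end = (end_gpa + (1ULL << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
    size_t zapped = 0;

    for (uint64_t gfn = gfn_start; gfn < gfn_end && gfn < NR_GFNS; gfn++) {
        if (sptes[gfn] != 0) {
            sptes[gfn] = 0;
            zapped++;
        }
    }
    return zapped;
}
```

      Dropped entries would be rebuilt on the next fault, at which point the
      new cache type can be applied to the last-level spte.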
    • KVM: MMU: fix decoding cache type from MTRR · d69afbc6
      Xiao Guangrong authored
      
      
      There are some bugs in the current get_mtrr_type():
      1: bit 1 of mtrr_state->enabled corresponds to bit 11 of the
         IA32_MTRR_DEF_TYPE MSR, which completely controls MTRR enablement;
         that means the other bits are ignored if it is cleared
      
      2: the fixed MTRR ranges are controlled by bit 0 of
         mtrr_state->enabled (bit 10 of IA32_MTRR_DEF_TYPE)
      
      3: if MTRR is disabled, UC is applied to all of physical memory rather
         than mtrr_state->def_type
      
      Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
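
      The three points in the get_mtrr_type() message can be sketched as a
      small decoder for the enable bits. This is an illustration under
      stated assumptions, not the patched function: struct
      mtrr_state_sketch and the helper names are hypothetical, and only the
      enable/default-type logic is modelled, not the lookup of fixed or
      variable ranges.

```c
#include <stdbool.h>
#include <stdint.h>

#define MTRR_TYPE_UNCACHABLE 0 /* UC */
#define MTRR_TYPE_WRBACK     6 /* WB */

/* Hypothetical, reduced view of the MTRR state. */
struct mtrr_state_sketch {
    uint8_t enabled;  /* bit 0 = FE (MSR bit 10), bit 1 = E (MSR bit 11) */
    uint8_t def_type; /* default memory type, MSR bits 7:0 */
};

/* Split an IA32_MTRR_DEF_TYPE value into the two fields above. */
static struct mtrr_state_sketch decode_def_type(uint64_t msr)
{
    struct mtrr_state_sketch s;

    s.enabled = (uint8_t)((msr >> 10) & 0x3);
    s.def_type = (uint8_t)(msr & 0xff);
    return s;
}

/* Point 1: bit 1 of enabled (E) gates everything else. */
static bool mtrr_enabled(const struct mtrr_state_sketch *s)
{
    return (s->enabled & 0x2) != 0;
}

/* Point 2: fixed ranges need both E and bit 0 of enabled (FE). */
static bool fixed_mtrr_enabled(const struct mtrr_state_sketch *s)
{
    return mtrr_enabled(s) && (s->enabled & 0x1) != 0;
}

/* Point 3: with MTRRs disabled, memory is UC regardless of def_type. */
static uint8_t default_memtype(const struct mtrr_state_sketch *s)
{
    return mtrr_enabled(s) ? s->def_type : MTRR_TYPE_UNCACHABLE;
}
```

      For example, an MSR value with only bit 10 set and a WB default type
      decodes to "MTRRs disabled", so the fixed ranges are ignored and the
      effective default type is UC.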