  1. Mar 24, 2009
    • KVM: Fix vmload and friends misinterpreted as lidt · 2b3d2a20
      Avi Kivity authored
      
      
      The AMD SVM instruction family all overload the 0f 01 /3 opcode, further
      multiplexing on the three r/m bits.  But the code decided that anything that
      isn't a vmmcall must be an lidt (which shares the 0f 01 /3 opcode, for the
      case that mod = 3).
      
      Fix by aborting emulation if this isn't a vmmcall.
      
      Signed-off-by: Avi Kivity <avi@redhat.com>
      2b3d2a20
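      The encoding overlap described above can be sketched as a small ModRM
      classifier. This is an illustration only, not the emulator's code:
      classify_0f01_group3 is a hypothetical name, and the mnemonic table
      assumes the AMD encodings 0f 01 d8 through 0f 01 df.

```c
#include <stddef.h>
#include <stdint.h>

/* Split a ModRM byte into its three fields. */
static unsigned modrm_mod(uint8_t m) { return m >> 6; }
static unsigned modrm_reg(uint8_t m) { return (m >> 3) & 7; }
static unsigned modrm_rm(uint8_t m)  { return m & 7; }

/*
 * Hypothetical classifier for the ModRM byte that follows 0f 01.
 * Returns a mnemonic, or NULL if the byte is not in the /3 group.
 */
static const char *classify_0f01_group3(uint8_t modrm)
{
    /* With mod == 3 the r/m bits select one of the SVM instructions. */
    static const char *const svm[8] = {
        "vmrun", "vmmcall", "vmload", "vmsave",
        "stgi",  "clgi",    "skinit", "invlpga",
    };

    if (modrm_reg(modrm) != 3)
        return NULL;
    if (modrm_mod(modrm) != 3)
        return "lidt";              /* memory-operand form */
    return svm[modrm_rm(modrm)];    /* register form: SVM family */
}
```

      With this split, 0f 01 d9 classifies as vmmcall and 0f 01 da as vmload,
      while only a memory-operand ModRM such as 0x18 is treated as lidt.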
    • KVM: MMU: Initialize a shadow page's global attribute from cr4.pge · e2078318
      Avi Kivity authored
      
      
      If cr4.pge is cleared, we ought to treat any ptes in the page as non-global.
      This allows us to remove the check from set_spte().
      
      Signed-off-by: Avi Kivity <avi@redhat.com>
      e2078318
    • KVM: MMU: Segregate mmu pages created with different cr4.pge settings · 2f0b3d60
      Avi Kivity authored
      
      
      Don't allow a vcpu with cr4.pge cleared to use a shadow page created with
      cr4.pge set; this might cause a cr3 switch not to sync ptes that have the
      global bit set (the global bit has no effect if !cr4.pge).
      
      This can only occur on smp with different cr4.pge settings for different
      vcpus (since a cr4 change will resync the shadow ptes), but there's no
      cost to being correct here.
      
      Signed-off-by: Avi Kivity <avi@redhat.com>
      2f0b3d60
    • KVM: MMU: Inherit a shadow page's guest level count from vcpu setup · a770f6f2
      Avi Kivity authored
      
      
      Instead of "calculating" it on every shadow page allocation, set it once
      when switching modes, and copy it when allocating pages.
      
      This doesn't buy us much, but sets up the stage for inheriting more
      information related to the mmu setup.
      
      Signed-off-by: Avi Kivity <avi@redhat.com>
      a770f6f2
    • KVM: ia64: Code cleanup · 22ccb142
      Xiantao Zhang authored
      
      
      Remove some unnecessary blank lines to conform to the kernel's coding style.
      Also remove vcpu_get_itir_on_fault, since nothing references it.
      
      Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      22ccb142
    • KVM: Remove old kvm_guest_debug structs · 989c0f0e
      Jan Kiszka authored
      
      
      Remove the remaining arch fragments of the old guest debug interface
      that now break non-x86 builds.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      989c0f0e
    • KVM: ia64: stack get/restore patch · e9a999fe
      Jes Sorensen authored
      
      
      Implement KVM_IA64_VCPU_[GS]ET_STACK ioctl calls. This is required
      for live migrations.
      
      Patch is based on previous implementation that was part of old
      GET/SET_REGS ioctl calls.
      
      Signed-off-by: Jes Sorensen <jes@sgi.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      e9a999fe
    • KVM: x86: Wire-up hardware breakpoints for guest debugging · ae675ef0
      Jan Kiszka authored
      
      
      Add the remaining bits to make use of debug registers also for guest
      debugging, thus enabling the use of hardware breakpoints and
      watchpoints.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      ae675ef0
    • KVM: x86: Virtualize debug registers · 42dbaa5a
      Jan Kiszka authored
      
      
      So far KVM only had basic x86 debug register support, once introduced to
      realize guest debugging that way. The guest itself was not able to use
      those registers.
      
      This patch now adds (almost) full support for guest self-debugging via
      hardware registers. It refactors the code, moving generic parts out of
      SVM (VMX was already cleaned up by the KVM_SET_GUEST_DEBUG patches), and
      it ensures that the registers are properly switched between host and
      guest.
      
      This patch also prepares debug register usage by the host. The latter
      will (once wired-up by the following patch) allow for hardware
      breakpoints/watchpoints in guest code. If this is enabled, the guest
      will only see faked debug registers without functionality, but with
      content reflecting the guest's modifications.
      
      Tested on Intel only, but SVM /should/ work as well, but who knows...
      
      Known limitations: Trapping on tss switch won't work - most probably on
      Intel.
      
      Credits also go to Joerg Roedel - I used his once posted debugging
      series as platform for this patch.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      42dbaa5a
    • KVM: VMX: Allow single-stepping when uninterruptible · 55934c0b
      Jan Kiszka authored
      
      
      When single-stepping over STI and MOV SS, we must clear the
      corresponding interruptibility bits in the guest state. Otherwise
      vmentry fails as it then expects bit 14 (BS) in pending debug exceptions
      being set, but that's not correct for the guest debugging case.
      
      Note that clearing those bits is safe as we check for interruptibility
      based on the original state and do not inject interrupts or NMIs if
      guest interruptibility was blocked.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      55934c0b
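      The bit clearing described in this commit might look roughly like the
      sketch below. The macro and function names are hypothetical; the bit
      positions assume the VMCS guest interruptibility-state layout (bit 0 =
      blocking by STI, bit 1 = blocking by MOV SS).

```c
#include <stdint.h>

/* Bits 0 and 1 of the VMCS guest interruptibility-state field. */
#define INTR_STATE_BLOCK_BY_STI     (1u << 0)
#define INTR_STATE_BLOCK_BY_MOV_SS  (1u << 1)

/*
 * Hypothetical helper: compute the interruptibility state to write back
 * before vmentry. When single-stepping, the STI and MOV SS blocking bits
 * are dropped; all other bits (for example NMI blocking) are preserved.
 */
static uint32_t singlestep_intr_state(uint32_t intr_state, int singlestep)
{
    if (singlestep)
        intr_state &= ~(INTR_STATE_BLOCK_BY_STI | INTR_STATE_BLOCK_BY_MOV_SS);
    return intr_state;
}
```

      The caller would still decide whether to inject interrupts from the
      original, unmodified value, which is what makes the clearing safe.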
    • KVM: New guest debug interface · d0bfb940
      Jan Kiszka authored
      
      
      This rips out the support for KVM_DEBUG_GUEST and introduces a new IOCTL
      instead: KVM_SET_GUEST_DEBUG. The IOCTL payload consists of a generic
      part, controlling the "main switch" and the single-step feature. The
      arch specific part adds an x86 interface for intercepting both types of
      debug exceptions separately and re-injecting them when the host was not
      interested. Moreover, the foundation for guest debugging via debug
      registers is laid.
      
      To signal breakpoint events properly back to userland, an arch-specific
      data block is now returned along KVM_EXIT_DEBUG. For x86, the arch block
      contains the PC, the debug exception, and relevant debug registers to
      tell debug events properly apart.
      
      The availability of this new interface is signaled by
      KVM_CAP_SET_GUEST_DEBUG. Empty stubs for not yet supported archs are
      provided.
      
      Note that both SVM and VTX are supported, but only the latter has been
      tested so far. Based on the experience with all those VTX corner cases, I
      would be fairly surprised if SVM works out of the box.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      d0bfb940
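      From userspace, the generic part of the new interface could be driven
      roughly as follows. This is a minimal sketch that assumes a Linux build
      environment providing <linux/kvm.h>; fill_guest_debug and
      enable_guest_debug are hypothetical helper names, and vcpu_fd is assumed
      to come from KVM_CREATE_VCPU. The arch-specific part is left zeroed.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fill the generic payload: the main switch plus optional single-step. */
static void fill_guest_debug(struct kvm_guest_debug *dbg, int singlestep)
{
    memset(dbg, 0, sizeof(*dbg));
    dbg->control = KVM_GUESTDBG_ENABLE;
    if (singlestep)
        dbg->control |= KVM_GUESTDBG_SINGLESTEP;
}

/*
 * Hypothetical wrapper. vcpu_fd is assumed to be a vcpu file descriptor;
 * returns the ioctl result (-1 with errno set on failure).
 */
static int enable_guest_debug(int vcpu_fd, int singlestep)
{
    struct kvm_guest_debug dbg;

    fill_guest_debug(&dbg, singlestep);
    return ioctl(vcpu_fd, KVM_SET_GUEST_DEBUG, &dbg);
}
```

      A careful caller would first probe KVM_CAP_SET_GUEST_DEBUG with
      KVM_CHECK_EXTENSION, and would read the arch-specific debug block from
      the run structure when the vcpu exits with KVM_EXIT_DEBUG.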
    • KVM: VMX: Support for injecting software exceptions · 8ab2d2e2
      Jan Kiszka authored
      
      
      VMX differentiates between processor and software generated exceptions
      when injecting them into the guest. Extend vmx_queue_exception
      accordingly (and refactor related constants) so that we can use this
      service reliably for the new guest debugging framework.
      
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      8ab2d2e2
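      The distinction can be illustrated by how the VM-entry interruption
      information word is composed. The sketch below is not the patch's
      vmx_queue_exception; exception_intr_info is a hypothetical helper and the
      constants assume the field layout from the Intel manual (type 3 =
      hardware exception, type 6 = software exception).

```c
#include <stdint.h>

#define INTR_INFO_VECTOR_MASK        0xffu
#define INTR_TYPE_HARD_EXCEPTION     (3u << 8)   /* processor generated */
#define INTR_TYPE_SOFT_EXCEPTION     (6u << 8)   /* raised by int3 / into */
#define INTR_INFO_DELIVER_CODE_MASK  (1u << 11)
#define INTR_INFO_VALID_MASK         (1u << 31)

#define BP_VECTOR 3   /* #BP, raised by int3 */
#define OF_VECTOR 4   /* #OF, raised by into */

/* Hypothetical helper: build the interruption-information word. */
static uint32_t exception_intr_info(unsigned vector, int has_error_code)
{
    uint32_t info = (vector & INTR_INFO_VECTOR_MASK) | INTR_INFO_VALID_MASK;

    if (vector == BP_VECTOR || vector == OF_VECTOR)
        info |= INTR_TYPE_SOFT_EXCEPTION;
    else
        info |= INTR_TYPE_HARD_EXCEPTION;
    if (has_error_code)
        info |= INTR_INFO_DELIVER_CODE_MASK;
    return info;
}
```

      For software exceptions the hypervisor also has to supply the
      instruction length on VM entry, which this sketch leaves out.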
    • KVM: SVM: Only allow setting of EFER_SVME when CPUID SVM is set · d8017474
      Alexander Graf authored
      
      
      Userspace has to tell the kernel module somehow that nested SVM should be used.
      The easiest way that doesn't break anything I could think of is to implement
      
      if (cpuid & svm)
          allow write to efer
      else
          deny write to efer
      
      Old userspaces mask the SVM capability bit, so they don't break.
      In order to find out that the SVM capability is set, I had to split the
      kvm_emulate_cpuid into a finding and an emulating part.
      
      (introduced in v6)
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      d8017474
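      The pseudocode in this commit message can be fleshed out as the sketch
      below. It is an illustration, not the patch itself: efer_write_allowed is
      a hypothetical name, and the constants assume that the SVM feature flag is
      bit 2 of ECX in CPUID leaf 0x80000001 and that EFER.SVME is bit 12.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_8000_0001_ECX_SVM  (1u << 2)     /* SVM feature flag */
#define EFER_SVME                (1ull << 12)  /* SVM enable bit in EFER */

/*
 * Hypothetical check: guest_cpuid_ecx is the ECX value the guest sees for
 * CPUID leaf 0x80000001, as configured by userspace.
 */
static bool efer_write_allowed(uint32_t guest_cpuid_ecx, uint64_t new_efer)
{
    if (!(new_efer & EFER_SVME))
        return true;   /* this check only concerns the SVME bit */
    return (guest_cpuid_ecx & CPUID_8000_0001_ECX_SVM) != 0;
}
```

      Because old userspace masks the SVM bit out of the guest's CPUID, a
      guest running under it keeps getting the write rejected, as before.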
    • KVM: SVM: Allow setting the SVME bit · 236de055
      Alexander Graf authored
      
      
      Normally setting the SVME bit in EFER is not allowed, as we did
      not support SVM. Now that we do, we should also allow enabling
      SVM mode.
      
      v2 comes as last patch, so we don't enable half-ready code
      v4 introduces a module option to enable SVM
      v6 warns that nesting is enabled
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      236de055
    • KVM: SVM: Allow read access to MSR_VM_VR · eb6f302e
      Joerg Roedel authored
      
      
      KVM tries to read the VM_CR MSR to find out if SVM was disabled by
      the BIOS. So implement read support for this MSR to get nested
      SVM running.
      
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      eb6f302e
    • KVM: SVM: Add VMEXIT handler and intercepts · cf74a78b
      Alexander Graf authored
      
      
      This adds the #VMEXIT intercept, so we return to the level 1 guest
      when something happens in the level 2 guest that should return to
      the level 1 guest.
      
      v2 implements HIF handling and cleans up exception interception
      v3 adds support for V_INTR_MASKING_MASK
      v4 uses the host page hsave
      v5 removes IOPM merging code
      v6 moves mmu code out of the atomic section
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      cf74a78b
    • KVM: SVM: Add VMRUN handler · 3d6368ef
      Alexander Graf authored
      
      
      This patch implements VMRUN. VMRUN enters a virtual CPU and runs that
      in the same context as the normal guest CPU would run.
      So basically it is implemented the same way a normal CPU would do it.
      
      We also prepare all intercepts that get OR'ed with the original
      intercepts, as we do not allow a level 2 guest to be intercepted less
      than the first level guest.
      
      v2 implements the following improvements:
      
      - fixes the CPL check
      - does not allocate iopm when not used
      - remembers the host's IF in the HIF bit in the hflags
      
      v3:
      
      - make use of the new permission checking
      - add support for V_INTR_MASKING_MASK
      
      v4:
      
      - use host page backed hsave
      
      v5:
      
      - remove IOPM merging code
      
      v6:
      
      - save cr4 so PAE l1 guests work
      
      v7:
      
      - return 0 on vmrun so we check the MSRs too
      - fix MSR check to use the correct variable
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      3d6368ef
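      The intercept merging mentioned in this commit (the level 2 guest is
      never intercepted less than the first level guest) can be sketched as a
      plain bitwise OR. The struct below is a hypothetical, reduced stand-in for
      the intercept vectors of a VMCB control area, not the kernel's definition.

```c
#include <stdint.h>

/* Hypothetical, reduced view of the intercept vectors in a control area. */
struct intercepts {
    uint16_t cr_read;
    uint16_t cr_write;
    uint16_t dr_read;
    uint16_t dr_write;
    uint32_t exceptions;
    uint64_t instructions;
};

/*
 * Merge what the host already intercepts for the level 1 guest with what
 * the level 1 guest requests for its level 2 guest. OR-ing guarantees that
 * no intercept the host relies on is dropped.
 */
static struct intercepts merge_intercepts(const struct intercepts *host,
                                          const struct intercepts *nested)
{
    struct intercepts m;

    m.cr_read      = host->cr_read      | nested->cr_read;
    m.cr_write     = host->cr_write     | nested->cr_write;
    m.dr_read      = host->dr_read      | nested->dr_read;
    m.dr_write     = host->dr_write     | nested->dr_write;
    m.exceptions   = host->exceptions   | nested->exceptions;
    m.instructions = host->instructions | nested->instructions;
    return m;
}
```

      On #VMEXIT the handler then has to decide which of the two levels an
      exit belongs to, since the merged set covers both.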
    • KVM: SVM: Add VMLOAD and VMSAVE handlers · 5542675b
      Alexander Graf authored
      
      
      This implements the VMLOAD and VMSAVE instructions, which usually surround
      the VMRUN instructions. Both instructions load / restore the same elements,
      so we only need to implement them once.
      
      v2 fixes CPL checking and replaces memcpy by assignments
      v3 makes use of the new permission checking
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      5542675b
    • KVM: SVM: Implement hsave · b286d5d8
      Alexander Graf authored
      
      
      Implement the hsave MSR, which gives the VCPU a GPA to save the
      old guest state in.
      
      v2 allows userspace to save/restore hsave
      v4 dummys out the hsave MSR, so we use a host page
      v6 remembers the guest's hsave and exports the MSR
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      b286d5d8
    • KVM: SVM: Implement GIF, clgi and stgi · 1371d904
      Alexander Graf authored
      
      
      This patch implements the GIF flag and the clgi and stgi instructions that
      clear and set this flag. Interrupts can be received by the CPU only if the
      flag is set (the default).

      To keep the information about that somewhere, this patch adds a new hidden
      flags vector that is used to store information that does not go into the
      vmcb, but is SVM specific.
      
      I tried to write some code to make -no-kvm-irqchip work too, but the first
      level guest won't even boot with that atm, so I ditched it.
      
      v2 moves the hflags to x86 generic code
      v3 makes use of the new permission helper
      v6 only enables interrupt_window if GIF=1
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      1371d904
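      The GIF handling described in this commit can be modelled with a single
      bit in a hidden flags word. The names below (HF_GIF, struct vcpu_state and
      the emulate_* helpers) are hypothetical and the bit position is an
      assumption; the sketch only shows the gating logic.

```c
#include <stdbool.h>
#include <stdint.h>

#define HF_GIF (1u << 0)   /* hypothetical bit position for the GIF */

/* Hypothetical per-vcpu state holding flags that do not live in the vmcb. */
struct vcpu_state {
    uint32_t hflags;
};

/* GIF is set by default. */
static void vcpu_reset(struct vcpu_state *v)   { v->hflags = HF_GIF; }

/* stgi sets the global interrupt flag, clgi clears it. */
static void emulate_stgi(struct vcpu_state *v) { v->hflags |= HF_GIF; }
static void emulate_clgi(struct vcpu_state *v) { v->hflags &= ~HF_GIF; }

/* Interrupts are deliverable only with GIF set and RFLAGS.IF set. */
static bool interrupts_deliverable(const struct vcpu_state *v, bool rflags_if)
{
    return (v->hflags & HF_GIF) && rflags_if;
}
```

      This also matches the v6 note: opening an interrupt window only makes
      sense while GIF is 1.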
    • KVM: SVM: Add helper functions for nested SVM · c0725420
      Alexander Graf authored
      
      
      These are helpers for the nested SVM implementation.
      
      - nsvm_printk implements a debug printk variant
      - nested_svm_do calls a handler that can access gpa-based memory
      
      v3 makes use of the new permission checker
      v6 changes:
      - streamline nsvm_debug()
      - remove printk(KERN_ERR)
      - SVME check before CPL check
      - give GP error code
      - use new EFER constant
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      c0725420
    • KVM: SVM: Move EFER and MSR constants to generic x86 code · 9962d032
      Alexander Graf authored
      
      
      MSR_EFER_SVME_MASK, MSR_VM_CR and MSR_VM_HSAVE_PA are set in KVM
      specific headers. Linux does have nice header files to collect
      EFER bits and MSR IDs, so IMHO we should put them there.
      
      While at it, I also changed the naming scheme to match that
      of the other defines.
      
      (introduced in v6)
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      9962d032
    • KVM: SVM: Clean up VINTR setting · f0b85051
      Alexander Graf authored
      
      
      The current VINTR intercept setters don't look clean to me. To make
      the code easier to read and enable the possibility to trap on a VINTR
      set, this uses a helper function to set the VINTR intercept.
      
      v2 uses two distinct functions for setting and clearing the bit
      
      Acked-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Avi Kivity <avi@redhat.com>
      f0b85051
    • Linux 2.6.29 · 8e0ee43b
      Linus Torvalds authored
      v2.6.29
      8e0ee43b
    • Build with -fno-dwarf2-cfi-asm · 00308649
      Kyle McMartin authored
      
      
      With a sufficiently new compiler and binutils, code which wasn't
      previously generating .eh_frame sections has begun to.  Certain
      architectures (powerpc, in this case) may generate unexpected relocation
      formats in response to this, preventing modules from loading.
      
      While the new relocation types should probably be handled, revert to the
      previous behaviour with regards to generation of .eh_frame sections.
      
      (This was reported against Fedora, which appears to be the only distro
      doing any building against gcc-4.4 at present: RH bz#486545.)
      
      Signed-off-by: Kyle McMartin <kyle@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: Alexandre Oliva <aoliva@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      00308649
    • trivial: fix orphan dates in ext2 documentation · 1db4b2d2
      Jody McIntyre authored
      
      
      Revert the change to the orphan dates of Windows 95, DOS, compression.
      Add a new orphan date for OS/2.
      
      Signed-off-by: Jody McIntyre <scjody@sun.com>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1db4b2d2
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · d56ffd38
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
        ucc_geth: Fix oops when using fixed-link support
        dm9000: locking bugfix
        net: update dnet.c for bus_id removal
        dnet: DNET should depend on HAS_IOMEM
        dca: add missing copyright/license headers
        nl80211: Check that function pointer != NULL before using it
        sungem: missing net_device_ops
        be2net: fix to restore vlan ids into BE2 during a IF DOWN->UP cycle
        be2net: replenish when posting to rx-queue is starved in out of mem conditions
        bas_gigaset: correctly allocate USB interrupt transfer buffer
        smsc911x: reset last known duplex and carrier on open
        sh_eth: Fix mistake of the address of SH7763
        sh_eth: Change handling of IRQ
        netns: oops in ip[6]_frag_reasm incrementing stats
        net: kfree(napi->skb) => kfree_skb
        net: fix sctp breakage
        ipv6: fix display of local and remote sit endpoints
        net: Document /proc/sys/net/core/netdev_budget
        tulip: fix crash on iface up with shirq debug
        virtio_net: Make virtio_net support carrier detection
        ...
      d56ffd38
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 · 12a37b5e
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
        sparc64: Fix crash with /proc/iomem
        sparc64: Reschedule KGDB capture to a software interrupt.
        sbus: Auto-load openprom module when device opened.
      12a37b5e
    • fix ptrace slowness · 53da1d94
      Miklos Szeredi authored
      
      
      This patch fixes bug #12208:
      
        Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12208
        Subject         : uml is very slow on 2.6.28 host
      
      This turned out to be not a scheduler regression, but an already
      existing problem in ptrace being triggered by subtle scheduler
      changes.
      
      The problem is this:
      
       - task A is ptracing task B
       - task B stops on a trace event
       - task A is woken up and preempts task B
       - task A calls ptrace on task B, which does ptrace_check_attach()
       - this calls wait_task_inactive(), which sees that task B is still on the runq
       - task A goes to sleep for a jiffy
       - ...
      
      Since UML does lots of the above sequences, those jiffies quickly add
      up to make it slow as hell.
      
      This patch solves this by not rescheduling in read_unlock() after
      ptrace_stop() has woken up the tracer.
      
      Thanks to Oleg Nesterov and Ingo Molnar for the feedback.
      
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      CC: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      53da1d94
  2. Mar 23, 2009