Commit 542a2640 authored by Paolo Bonzini

Merge tag 'kvm-riscv-5.16-1' of git://github.com/kvm-riscv/linux into HEAD

Initial KVM RISC-V support

The following features are supported by the initial KVM RISC-V support:
1. No RISC-V specific KVM IOCTL
2. Loadable KVM RISC-V module
3. Minimal possible KVM world-switch which touches only GPRs and a few CSRs
4. Works on both RV64 and RV32 hosts
5. Full Guest/VM switch via vcpu_get/vcpu_put infrastructure
6. KVM ONE_REG interface for VCPU register access from KVM user-space
7. Interrupt controller emulation in KVM user-space
8. Timer and IPI emulation in the kernel
9. Both Sv39x4 and Sv48x4 supported for RV64 hosts
10. MMU notifiers supported
11. Generic dirty log supported
12. FP lazy save/restore supported
13. SBI v0.1 emulation for Guest/VM
14. Forward unhandled SBI calls to KVM user-space
15. Hugepage support for Guest/VM
16. IOEVENTFD support for Vhost
parents deae4a10 24b699d1
+184 −9
@@ -532,7 +532,7 @@ translation mode.
------------------

:Capability: basic
:Architectures: x86, ppc, mips
:Architectures: x86, ppc, mips, riscv
:Type: vcpu ioctl
:Parameters: struct kvm_interrupt (in)
:Returns: 0 on success, negative on failure.
@@ -601,6 +601,23 @@ interrupt number dequeues the interrupt.

This is an asynchronous vcpu ioctl and can be invoked from any thread.

RISC-V:
^^^^^^^

Queues an external interrupt to be injected into the virtual CPU. This ioctl
is overloaded with two different irq values:

a) KVM_INTERRUPT_SET

   This sets the external interrupt for a virtual CPU, which will receive it
   once it is ready.

b) KVM_INTERRUPT_UNSET

   This clears the pending external interrupt for a virtual CPU.

This is an asynchronous vcpu ioctl and can be invoked from any thread.
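
As an illustration of the two irq values described above, a userspace VMM
might wrap the ioctl roughly as follows. This is a minimal sketch, not code
from this commit: riscv_set_external_irq is a hypothetical helper name,
vcpu_fd is assumed to be a descriptor returned by KVM_CREATE_VCPU, and the
fallback definitions of KVM_INTERRUPT_SET and KVM_INTERRUPT_UNSET are
assumptions that only exist so the sketch builds on non-RISC-V hosts (the
RISC-V uapi header is authoritative).

```c
#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>	/* struct kvm_interrupt, KVM_INTERRUPT */

/* Assumed fallback values; on a RISC-V host <linux/kvm.h> provides them. */
#ifndef KVM_INTERRUPT_SET
#define KVM_INTERRUPT_SET	(-1U)
#endif
#ifndef KVM_INTERRUPT_UNSET
#define KVM_INTERRUPT_UNSET	(-2U)
#endif

/*
 * Hypothetical helper: assert (pending = true) or clear (pending = false)
 * the external interrupt of one VCPU. Returns the ioctl result, that is
 * 0 on success and -1 with errno set on failure.
 */
static int riscv_set_external_irq(int vcpu_fd, bool pending)
{
	struct kvm_interrupt irq = {
		.irq = pending ? KVM_INTERRUPT_SET : KVM_INTERRUPT_UNSET,
	};

	return ioctl(vcpu_fd, KVM_INTERRUPT, &irq);
}
```

Because the ioctl is asynchronous, a helper like this could be called from an
I/O thread that emulates the interrupt controller rather than from the thread
running the VCPU.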


4.17 KVM_DEBUG_GUEST
--------------------
@@ -1399,7 +1416,7 @@ for vm-wide capabilities.
---------------------

:Capability: KVM_CAP_MP_STATE
:Architectures: x86, s390, arm, arm64
:Architectures: x86, s390, arm, arm64, riscv
:Type: vcpu ioctl
:Parameters: struct kvm_mp_state (out)
:Returns: 0 on success; -1 on error
@@ -1416,7 +1433,8 @@ uniprocessor guests).
Possible values are:

   ==========================    ===============================================
   KVM_MP_STATE_RUNNABLE         the vcpu is currently running [x86,arm/arm64]
   KVM_MP_STATE_RUNNABLE         the vcpu is currently running
                                 [x86,arm/arm64,riscv]
   KVM_MP_STATE_UNINITIALIZED    the vcpu is an application processor (AP)
                                 which has not yet received an INIT signal [x86]
   KVM_MP_STATE_INIT_RECEIVED    the vcpu has received an INIT signal, and is
@@ -1425,7 +1443,7 @@ Possible values are:
                                 is waiting for an interrupt [x86]
   KVM_MP_STATE_SIPI_RECEIVED    the vcpu has just received a SIPI (vector
                                 accessible via KVM_GET_VCPU_EVENTS) [x86]
   KVM_MP_STATE_STOPPED          the vcpu is stopped [s390,arm/arm64]
   KVM_MP_STATE_STOPPED          the vcpu is stopped [s390,arm/arm64,riscv]
   KVM_MP_STATE_CHECK_STOP       the vcpu is in a special error state [s390]
   KVM_MP_STATE_OPERATING        the vcpu is operating (running or halted)
                                 [s390]
@@ -1437,8 +1455,8 @@ On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
these architectures.

For arm/arm64:
^^^^^^^^^^^^^^
For arm/arm64/riscv:
^^^^^^^^^^^^^^^^^^^^

The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
@@ -1447,7 +1465,7 @@ KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
---------------------

:Capability: KVM_CAP_MP_STATE
:Architectures: x86, s390, arm, arm64
:Architectures: x86, s390, arm, arm64, riscv
:Type: vcpu ioctl
:Parameters: struct kvm_mp_state (in)
:Returns: 0 on success; -1 on error
@@ -1459,8 +1477,8 @@ On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
these architectures.

For arm/arm64:
^^^^^^^^^^^^^^
For arm/arm64/riscv:
^^^^^^^^^^^^^^^^^^^^

The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
@@ -2577,6 +2595,144 @@ following id bit patterns::

  0x7020 0000 0003 02 <0:3> <reg:5>

RISC-V registers are mapped using the lower 32 bits. The upper 8 bits of
those 32 bits are the register group type.

RISC-V config registers are used to configure a Guest VCPU and have the
following id bit patterns::

  0x8020 0000 01 <index into the kvm_riscv_config struct:24> (32bit Host)
  0x8030 0000 01 <index into the kvm_riscv_config struct:24> (64bit Host)

The following are the RISC-V config registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x80x0 0000 0100 0000 isa       ISA feature bitmap of Guest VCPU
======================= ========= =============================================

The isa config register can be read at any time but can only be written
before a Guest VCPU runs. By default, its ISA feature bits match those of the
underlying host.

RISC-V core registers represent the general execution state of a Guest VCPU
and have the following id bit patterns::

  0x8020 0000 02 <index into the kvm_riscv_core struct:24> (32bit Host)
  0x8030 0000 02 <index into the kvm_riscv_core struct:24> (64bit Host)

The following are the RISC-V core registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x80x0 0000 0200 0000 regs.pc   Program counter
  0x80x0 0000 0200 0001 regs.ra   Return address
  0x80x0 0000 0200 0002 regs.sp   Stack pointer
  0x80x0 0000 0200 0003 regs.gp   Global pointer
  0x80x0 0000 0200 0004 regs.tp   Task pointer
  0x80x0 0000 0200 0005 regs.t0   Caller saved register 0
  0x80x0 0000 0200 0006 regs.t1   Caller saved register 1
  0x80x0 0000 0200 0007 regs.t2   Caller saved register 2
  0x80x0 0000 0200 0008 regs.s0   Callee saved register 0
  0x80x0 0000 0200 0009 regs.s1   Callee saved register 1
  0x80x0 0000 0200 000a regs.a0   Function argument (or return value) 0
  0x80x0 0000 0200 000b regs.a1   Function argument (or return value) 1
  0x80x0 0000 0200 000c regs.a2   Function argument 2
  0x80x0 0000 0200 000d regs.a3   Function argument 3
  0x80x0 0000 0200 000e regs.a4   Function argument 4
  0x80x0 0000 0200 000f regs.a5   Function argument 5
  0x80x0 0000 0200 0010 regs.a6   Function argument 6
  0x80x0 0000 0200 0011 regs.a7   Function argument 7
  0x80x0 0000 0200 0012 regs.s2   Callee saved register 2
  0x80x0 0000 0200 0013 regs.s3   Callee saved register 3
  0x80x0 0000 0200 0014 regs.s4   Callee saved register 4
  0x80x0 0000 0200 0015 regs.s5   Callee saved register 5
  0x80x0 0000 0200 0016 regs.s6   Callee saved register 6
  0x80x0 0000 0200 0017 regs.s7   Callee saved register 7
  0x80x0 0000 0200 0018 regs.s8   Callee saved register 8
  0x80x0 0000 0200 0019 regs.s9   Callee saved register 9
  0x80x0 0000 0200 001a regs.s10  Callee saved register 10
  0x80x0 0000 0200 001b regs.s11  Callee saved register 11
  0x80x0 0000 0200 001c regs.t3   Caller saved register 3
  0x80x0 0000 0200 001d regs.t4   Caller saved register 4
  0x80x0 0000 0200 001e regs.t5   Caller saved register 5
  0x80x0 0000 0200 001f regs.t6   Caller saved register 6
  0x80x0 0000 0200 0020 mode      Privilege mode (1 = S-mode or 0 = U-mode)
======================= ========= =============================================

RISC-V csr registers represent the supervisor mode control/status registers
of a Guest VCPU and have the following id bit patterns::

  0x8020 0000 03 <index into the kvm_riscv_csr struct:24> (32bit Host)
  0x8030 0000 03 <index into the kvm_riscv_csr struct:24> (64bit Host)

The following are the RISC-V csr registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x80x0 0000 0300 0000 sstatus   Supervisor status
  0x80x0 0000 0300 0001 sie       Supervisor interrupt enable
  0x80x0 0000 0300 0002 stvec     Supervisor trap vector base
  0x80x0 0000 0300 0003 sscratch  Supervisor scratch register
  0x80x0 0000 0300 0004 sepc      Supervisor exception program counter
  0x80x0 0000 0300 0005 scause    Supervisor trap cause
  0x80x0 0000 0300 0006 stval     Supervisor bad address or instruction
  0x80x0 0000 0300 0007 sip       Supervisor interrupt pending
  0x80x0 0000 0300 0008 satp      Supervisor address translation and protection
======================= ========= =============================================
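
The id bit patterns of the config, core and csr groups above can be composed
mechanically. The following is a minimal sketch under stated assumptions: the
constants are transcribed from the patterns in this section, but their names,
the enum and the function riscv_one_reg_id are illustrative and are not the
uapi macro names.

```c
#include <assert.h>
#include <stdint.h>

/* Transcribed from the bit patterns above; the names are illustrative. */
#define REG_ARCH_RISCV	UINT64_C(0x8000000000000000)
#define REG_SIZE_U32	UINT64_C(0x0020000000000000)	/* 32bit Host */
#define REG_SIZE_U64	UINT64_C(0x0030000000000000)	/* 64bit Host */
#define REG_GROUP_SHIFT	24

enum riscv_reg_group {
	GROUP_CONFIG = 0x01,
	GROUP_CORE   = 0x02,
	GROUP_CSR    = 0x03,
};

/*
 * Build a ONE_REG id for a register whose width follows the host XLEN.
 * index is the position of the register inside the group's struct and
 * must fit in the low 24 bits.
 */
static uint64_t riscv_one_reg_id(int host_is_64bit,
				 enum riscv_reg_group group, uint32_t index)
{
	assert(index < (UINT32_C(1) << REG_GROUP_SHIFT));

	return REG_ARCH_RISCV |
	       (host_is_64bit ? REG_SIZE_U64 : REG_SIZE_U32) |
	       ((uint64_t)group << REG_GROUP_SHIFT) |
	       index;
}
```

For example, riscv_one_reg_id(1, GROUP_CORE, 1) evaluates to
0x8030000002000001, which is the regs.ra encoding listed for a 64bit Host.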

RISC-V timer registers represent the timer state of a Guest VCPU and have
the following id bit patterns::

  0x8030 0000 04 <index into the kvm_riscv_timer struct:24>

The following are the RISC-V timer registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x8030 0000 0400 0000 frequency Time base frequency (read-only)
  0x8030 0000 0400 0001 time      Time value visible to Guest
  0x8030 0000 0400 0002 compare   Time compare programmed by Guest
  0x8030 0000 0400 0003 state     Time compare state (1 = ON or 0 = OFF)
======================= ========= =============================================

RISC-V F-extension registers represent the single precision floating point
state of a Guest VCPU and have the following id bit patterns::

  0x8020 0000 05 <index into the __riscv_f_ext_state struct:24>

The following are the RISC-V F-extension registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x8020 0000 0500 0000 f[0]      Floating point register 0
  ...
  0x8020 0000 0500 001f f[31]     Floating point register 31
  0x8020 0000 0500 0020 fcsr      Floating point control and status register
======================= ========= =============================================

RISC-V D-extension registers represent the double precision floating point
state of a Guest VCPU and have the following id bit patterns::

  0x8020 0000 06 <index into the __riscv_d_ext_state struct:24> (fcsr)
  0x8030 0000 06 <index into the __riscv_d_ext_state struct:24> (non-fcsr)

The following are the RISC-V D-extension registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x8030 0000 0600 0000 f[0]      Floating point register 0
  ...
  0x8030 0000 0600 001f f[31]     Floating point register 31
  0x8020 0000 0600 0020 fcsr      Floating point control and status register
======================= ========= =============================================
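
Unlike the XLEN-dependent groups, the floating point groups fix the register
size per register: every F-extension register is 32 bits wide, while the
D-extension uses 64-bit ids for f[0] to f[31] and a 32-bit id for fcsr. A
minimal sketch of that selection follows; riscv_fp_reg_id and the constant
names are hypothetical, and only the numeric values come from the tables
above.

```c
#include <stdint.h>

/* Transcribed from the bit patterns above; the names are illustrative. */
#define REG_ARCH_RISCV	UINT64_C(0x8000000000000000)
#define REG_SIZE_U32	UINT64_C(0x0020000000000000)
#define REG_SIZE_U64	UINT64_C(0x0030000000000000)
#define GROUP_FP_F	UINT64_C(0x05)
#define GROUP_FP_D	UINT64_C(0x06)
#define REG_GROUP_SHIFT	24
#define FP_FCSR_INDEX	32u	/* index 0x20 in both tables */

/*
 * Return the ONE_REG id of a floating point register, or 0 if index is
 * outside the tables (0..31 are f[0]..f[31], 32 is fcsr).
 */
static uint64_t riscv_fp_reg_id(int is_double, unsigned int index)
{
	uint64_t group = is_double ? GROUP_FP_D : GROUP_FP_F;
	uint64_t size = REG_SIZE_U32;

	if (index > FP_FCSR_INDEX)
		return 0;

	/* Only the D-extension data registers are 64 bits wide. */
	if (is_double && index < FP_FCSR_INDEX)
		size = REG_SIZE_U64;

	return REG_ARCH_RISCV | size | (group << REG_GROUP_SHIFT) | index;
}
```

Returning 0 for an out-of-range index is a choice made for this sketch; 0 is
not a valid RISC-V register id because the architecture bits are missing.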


4.69 KVM_GET_ONE_REG
--------------------
@@ -5848,6 +6004,25 @@ Valid values for 'type' are:
    Userspace is expected to place the hypercall result into the appropriate
    field before invoking KVM_RUN again.

::

		/* KVM_EXIT_RISCV_SBI */
		struct {
			unsigned long extension_id;
			unsigned long function_id;
			unsigned long args[6];
			unsigned long ret[2];
		} riscv_sbi;

If the exit reason is KVM_EXIT_RISCV_SBI, the VCPU has made an SBI call that
is not handled by the KVM RISC-V kernel module. The details of the SBI call
are available in the 'riscv_sbi' member of the kvm_run structure. The
'extension_id' field of 'riscv_sbi' holds the SBI extension ID, whereas the
'function_id' field holds the function ID within that SBI extension. The
'args' array field of 'riscv_sbi' holds the parameters of the SBI call and the
'ret' array field holds the return values. Userspace should update the return
values of the SBI call before resuming the VCPU. For more details on the
RISC-V SBI spec, refer to https://github.com/riscv/riscv-sbi-doc.
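
To make the userspace side of this exit concrete, the sketch below handles
one forwarded call and rejects everything else. It is an illustration under
stated assumptions, not code from this commit: struct riscv_sbi_exit is a
local mirror of the 'riscv_sbi' member (a real VMM would use struct kvm_run),
handle_sbi_exit is a hypothetical name, and placing the result in ret[0] is
this sketch's reading of the text above. The extension ID 0x01 (console
putchar in SBI v0.1) and the error code -2 (SBI_ERR_NOT_SUPPORTED) are taken
from the SBI specification.

```c
#include <stdio.h>

/* Local mirror of the 'riscv_sbi' member of struct kvm_run. */
struct riscv_sbi_exit {
	unsigned long extension_id;
	unsigned long function_id;
	unsigned long args[6];
	unsigned long ret[2];
};

#define SBI_LEGACY_CONSOLE_PUTCHAR	0x01	/* SBI v0.1 extension ID */
#define SBI_ERR_NOT_SUPPORTED		(-2L)

/*
 * Hypothetical handler: emulate console putchar on the given stream and
 * report every other extension as unsupported. The caller is expected to
 * invoke KVM_RUN again afterwards so the guest sees the updated ret[].
 */
static void handle_sbi_exit(struct riscv_sbi_exit *sbi, FILE *console)
{
	switch (sbi->extension_id) {
	case SBI_LEGACY_CONSOLE_PUTCHAR:
		fputc((int)(sbi->args[0] & 0xff), console);
		sbi->ret[0] = 0;
		break;
	default:
		sbi->ret[0] = (unsigned long)SBI_ERR_NOT_SUPPORTED;
		break;
	}
}
```

Answering unknown extensions with an explicit error, instead of leaving ret[]
untouched, keeps the guest from reading stale values after it resumes.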

::

		/* Fix the size of the union. */
+12 −0
@@ -10270,6 +10270,18 @@ F: arch/powerpc/include/uapi/asm/kvm*
F:	arch/powerpc/kernel/kvm*
F:	arch/powerpc/kvm/

KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)
M:	Anup Patel <anup.patel@wdc.com>
R:	Atish Patra <atish.patra@wdc.com>
L:	kvm@vger.kernel.org
L:	kvm-riscv@lists.infradead.org
L:	linux-riscv@lists.infradead.org
S:	Maintained
T:	git git://github.com/kvm-riscv/linux.git
F:	arch/riscv/include/asm/kvm*
F:	arch/riscv/include/uapi/asm/kvm*
F:	arch/riscv/kvm/

KERNEL VIRTUAL MACHINE for s390 (KVM/s390)
M:	Christian Borntraeger <borntraeger@de.ibm.com>
M:	Janosch Frank <frankja@linux.ibm.com>
+1 −0
@@ -562,4 +562,5 @@ source "kernel/power/Kconfig"

endmenu

source "arch/riscv/kvm/Kconfig"
source "drivers/firmware/Kconfig"
+1 −0
@@ -100,6 +100,7 @@ endif
head-y := arch/riscv/kernel/head.o

core-$(CONFIG_RISCV_ERRATA_ALTERNATIVE) += arch/riscv/errata/
core-$(CONFIG_KVM) += arch/riscv/kvm/

libs-y += arch/riscv/lib/
libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
+87 −0
@@ -58,22 +58,32 @@

/* Interrupt causes (minus the high bit) */
#define IRQ_S_SOFT		1
#define IRQ_VS_SOFT		2
#define IRQ_M_SOFT		3
#define IRQ_S_TIMER		5
#define IRQ_VS_TIMER		6
#define IRQ_M_TIMER		7
#define IRQ_S_EXT		9
#define IRQ_VS_EXT		10
#define IRQ_M_EXT		11

/* Exception causes */
#define EXC_INST_MISALIGNED	0
#define EXC_INST_ACCESS		1
#define EXC_INST_ILLEGAL	2
#define EXC_BREAKPOINT		3
#define EXC_LOAD_ACCESS		5
#define EXC_STORE_ACCESS	7
#define EXC_SYSCALL		8
#define EXC_HYPERVISOR_SYSCALL	9
#define EXC_SUPERVISOR_SYSCALL	10
#define EXC_INST_PAGE_FAULT	12
#define EXC_LOAD_PAGE_FAULT	13
#define EXC_STORE_PAGE_FAULT	15
#define EXC_INST_GUEST_PAGE_FAULT	20
#define EXC_LOAD_GUEST_PAGE_FAULT	21
#define EXC_VIRTUAL_INST_FAULT		22
#define EXC_STORE_GUEST_PAGE_FAULT	23

/* PMP configuration */
#define PMP_R			0x01
@@ -85,6 +95,58 @@
#define PMP_A_NAPOT		0x18
#define PMP_L			0x80

/* HSTATUS flags */
#ifdef CONFIG_64BIT
#define HSTATUS_VSXL		_AC(0x300000000, UL)
#define HSTATUS_VSXL_SHIFT	32
#endif
#define HSTATUS_VTSR		_AC(0x00400000, UL)
#define HSTATUS_VTW		_AC(0x00200000, UL)
#define HSTATUS_VTVM		_AC(0x00100000, UL)
#define HSTATUS_VGEIN		_AC(0x0003f000, UL)
#define HSTATUS_VGEIN_SHIFT	12
#define HSTATUS_HU		_AC(0x00000200, UL)
#define HSTATUS_SPVP		_AC(0x00000100, UL)
#define HSTATUS_SPV		_AC(0x00000080, UL)
#define HSTATUS_GVA		_AC(0x00000040, UL)
#define HSTATUS_VSBE		_AC(0x00000020, UL)

/* HGATP flags */
#define HGATP_MODE_OFF		_AC(0, UL)
#define HGATP_MODE_SV32X4	_AC(1, UL)
#define HGATP_MODE_SV39X4	_AC(8, UL)
#define HGATP_MODE_SV48X4	_AC(9, UL)

#define HGATP32_MODE_SHIFT	31
#define HGATP32_VMID_SHIFT	22
#define HGATP32_VMID_MASK	_AC(0x1FC00000, UL)
#define HGATP32_PPN		_AC(0x003FFFFF, UL)

#define HGATP64_MODE_SHIFT	60
#define HGATP64_VMID_SHIFT	44
#define HGATP64_VMID_MASK	_AC(0x03FFF00000000000, UL)
#define HGATP64_PPN		_AC(0x00000FFFFFFFFFFF, UL)

#define HGATP_PAGE_SHIFT	12

#ifdef CONFIG_64BIT
#define HGATP_PPN		HGATP64_PPN
#define HGATP_VMID_SHIFT	HGATP64_VMID_SHIFT
#define HGATP_VMID_MASK		HGATP64_VMID_MASK
#define HGATP_MODE_SHIFT	HGATP64_MODE_SHIFT
#else
#define HGATP_PPN		HGATP32_PPN
#define HGATP_VMID_SHIFT	HGATP32_VMID_SHIFT
#define HGATP_VMID_MASK		HGATP32_VMID_MASK
#define HGATP_MODE_SHIFT	HGATP32_MODE_SHIFT
#endif

/* VSIP & HVIP relation */
#define VSIP_TO_HVIP_SHIFT	(IRQ_VS_SOFT - IRQ_S_SOFT)
#define VSIP_VALID_MASK		((_AC(1, UL) << IRQ_S_SOFT) | \
				 (_AC(1, UL) << IRQ_S_TIMER) | \
				 (_AC(1, UL) << IRQ_S_EXT))

/* symbolic CSR names: */
#define CSR_CYCLE		0xc00
#define CSR_TIME		0xc01
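
The VSIP and HVIP relation defined in this hunk encodes that each VS-level
interrupt bit sits one position above its S-level counterpart (1 to 2, 5 to
6, 9 to 10). A minimal standalone sketch of how the two macros could be used
follows. The macro values are copied from the header, with _AC(1, UL) written
as 1UL; the helper names vsip_to_hvip and hvip_to_vsip are hypothetical and
do not appear in this commit.

```c
/* Copied from the definitions above so the sketch builds on its own. */
#define IRQ_S_SOFT	1
#define IRQ_VS_SOFT	2
#define IRQ_S_TIMER	5
#define IRQ_S_EXT	9

#define VSIP_TO_HVIP_SHIFT	(IRQ_VS_SOFT - IRQ_S_SOFT)
#define VSIP_VALID_MASK		((1UL << IRQ_S_SOFT) | \
				 (1UL << IRQ_S_TIMER) | \
				 (1UL << IRQ_S_EXT))

/* Translate a guest-visible VSIP value into the matching HVIP bits. */
static unsigned long vsip_to_hvip(unsigned long vsip)
{
	return (vsip & VSIP_VALID_MASK) << VSIP_TO_HVIP_SHIFT;
}

/* Translate HVIP bits back into the guest-visible VSIP layout. */
static unsigned long hvip_to_vsip(unsigned long hvip)
{
	return (hvip >> VSIP_TO_HVIP_SHIFT) & VSIP_VALID_MASK;
}
```

Masking with VSIP_VALID_MASK before shifting drops any bit that is not one of
the three supervisor interrupt bits, so unrelated bits never reach HVIP.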
@@ -104,6 +166,31 @@
#define CSR_SIP			0x144
#define CSR_SATP		0x180

#define CSR_VSSTATUS		0x200
#define CSR_VSIE		0x204
#define CSR_VSTVEC		0x205
#define CSR_VSSCRATCH		0x240
#define CSR_VSEPC		0x241
#define CSR_VSCAUSE		0x242
#define CSR_VSTVAL		0x243
#define CSR_VSIP		0x244
#define CSR_VSATP		0x280

#define CSR_HSTATUS		0x600
#define CSR_HEDELEG		0x602
#define CSR_HIDELEG		0x603
#define CSR_HIE			0x604
#define CSR_HTIMEDELTA		0x605
#define CSR_HCOUNTEREN		0x606
#define CSR_HGEIE		0x607
#define CSR_HTIMEDELTAH		0x615
#define CSR_HTVAL		0x643
#define CSR_HIP			0x644
#define CSR_HVIP		0x645
#define CSR_HTINST		0x64a
#define CSR_HGATP		0x680
#define CSR_HGEIP		0xe12

#define CSR_MSTATUS		0x300
#define CSR_MISA		0x301
#define CSR_MIE			0x304
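
As a worked example of the HGATP field definitions added to this header, the
sketch below packs a translation mode, a VMID and the physical address of a
stage-2 page table root into an RV64 hgatp value. It assumes a 64-bit host
and copies the constants from the header with _AC(x, UL) written as
UINT64_C(x); make_hgatp64 is a hypothetical helper name and is not part of
this commit.

```c
#include <stdint.h>

/* Copied from the definitions above so the sketch builds on its own. */
#define HGATP_MODE_SV39X4	UINT64_C(8)
#define HGATP_MODE_SV48X4	UINT64_C(9)
#define HGATP64_MODE_SHIFT	60
#define HGATP64_VMID_SHIFT	44
#define HGATP64_VMID_MASK	UINT64_C(0x03FFF00000000000)
#define HGATP64_PPN		UINT64_C(0x00000FFFFFFFFFFF)
#define HGATP_PAGE_SHIFT	12

/*
 * Compose an RV64 hgatp value. pgd_phys is the physical address of the
 * root page table; it is converted to a page number before being stored.
 * VMID bits that do not fit the 14-bit field are discarded by the mask.
 */
static uint64_t make_hgatp64(uint64_t mode, uint64_t vmid, uint64_t pgd_phys)
{
	uint64_t hgatp = mode << HGATP64_MODE_SHIFT;

	hgatp |= (vmid << HGATP64_VMID_SHIFT) & HGATP64_VMID_MASK;
	hgatp |= (pgd_phys >> HGATP_PAGE_SHIFT) & HGATP64_PPN;

	return hgatp;
}
```

With mode HGATP_MODE_SV39X4, VMID 1 and a root table at 0x80200000, the
result is 0x8000100000080200: mode 8 in bits 63:60, VMID 1 at bit 44 and the
page number 0x80200 in the low bits.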