- Jul 21, 2023
-
Guo Ren authored
When s64ilp32 is enabled, CONFIG_32BIT=y but __riscv_xlen=64, so we must use __riscv_xlen to detect the real machine XLEN for CSR access. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
When s64ilp32 is enabled, CONFIG_32BIT=y but __riscv_xlen=64, so we must use __riscv_xlen to detect the real machine XLEN for CSR access. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Remove unnecessary configs so that rv32_defconfig differs minimally from the defconfig. CONFIG_ARCH_RV32I selects CONFIG_32BIT, so listing it in the file is unnecessary. Also, there is no need to comment on CONFIG_PORTABLE; that was probably left in by carelessness. The upcoming rv64ilp32_defconfig would look like: CONFIG_ARCH_RV64ILP32=y CONFIG_NONPORTABLE=y Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Palmer Dabbelt <palmer@rivosinc.com>
-
Guo Ren authored
Follow the rv32_defconfig rule to add rv64ilp32_defconfig; the only difference is: -CONFIG_ARCH_RV32I=y +CONFIG_ARCH_RV64ILP32=y Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
When rv64ilp32 was introduced, the 32BIT would work with rv64,isa. So use the architecture name instead of the ABI width name. This is an addition to the commit: 069b0d51 ("RISC-V: validate riscv,isa at boot, not during ISA string parsing"). Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The u32ilp32 uses the compat_pt_regs instead of the native pt_regs, so we borrow the compat code to support the u32ilp32 signal procedure in the s64ilp32 kernel. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The s64ilp32 supports both u64ilp32 and u32ilp32 ABIs, and their pt_regs differ. So introduce the compat feature to help u32ilp32 ABI. Now u64ilp32 and u32ilp32 applications could work with the s64ilp32 Linux ptrace concurrently. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
When the kernel is built with ILP32 memory model on 64bit ISA and supports ILP32 memory model on 32bit ISA in userspace, the ABIs are different between kernel and userspace, similar to COMPAT, so the option converts the 64-bit arguments of 32ILP32 syscalls to 64ILP32 calling convention. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
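The conversion described above can be pictured with a small sketch. It assumes the standard RISC-V ilp32 calling convention, where a 64-bit scalar is passed in a register pair with the low half in the lower-numbered register. The helper names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: merge the two 32-bit register
 * halves of a 64-bit syscall argument (low half first, as in the RISC-V
 * ilp32 calling convention) into one 64-bit value.
 */
static inline uint64_t merge_u64_arg(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Hypothetical wrapper shape: a u32ilp32 caller passes a signed 64-bit
 * offset as (off_lo, off_hi); the kernel side wants it as one value.
 */
static inline int64_t compat_offset(uint32_t off_lo, uint32_t off_hi)
{
	return (int64_t)merge_u64_arg(off_lo, off_hi);
}
```

A wrapper of this shape would sit in front of each syscall that takes a 64-bit argument, so the syscall body itself can stay unchanged.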
-
Guo Ren authored
There is an existing problem in the 64ilp32 gcc: it combines two pointers in one register. Liao is working on a fix. Until that is finished, we can fortunately avoid the problem with a simple noinline attribute.

    struct path {
            struct vfsmount *mnt;
            struct dentry *dentry;
    } __randomize_layout;

    struct nameidata {
            struct path     path;
            struct qstr     last;
            struct path     root;
            ...
    } __randomize_layout;

    struct nameidata *nd
    ...
    nd->path = nd->root;
    6c88            ld      a0,24(s1)
                            ^^ // Wrong arg of mntget
    e088            sd      a0,0(s1)
    // Need inserting "lw a0,0(s1)" here
    mntget(path->mnt);
    2a6150ef        jal     c01ce946 <mntget>

Any help with gcc is welcome :)

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: LiaoShihua <shihua@iscas.ac.cn>
-
Guo Ren authored
The callee-saved fp & ra are xlen size, not long size. This patch corrects the layout of struct stackframe.

    echo c > /proc/sysrq-trigger

Before the patch:

    sysrq: Trigger a crash
    Kernel panic - not syncing: sysrq triggered crash
    CPU: 0 PID: 102 Comm: sh Not tainted 6.3.0-rc1-00084-g9e2ba938797e-dirty #2
    Hardware name: riscv-virtio,qemu (DT)
    Call Trace:
    ---[ end Kernel panic - not syncing: sysrq triggered crash ]---

After the patch:

    sysrq: Trigger a crash
    Kernel panic - not syncing: sysrq triggered crash
    CPU: 0 PID: 102 Comm: sh Not tainted 6.3.0-rc1-00084-g9e2ba938797e-dirty #1
    Hardware name: riscv-virtio,qemu (DT)
    Call Trace:
    [<c00050c8>] dump_backtrace+0x1e/0x26
    [<c086dcae>] show_stack+0x2e/0x3c
    [<c0878e00>] dump_stack_lvl+0x40/0x5a
    [<c0878e30>] dump_stack+0x16/0x1e
    [<c086df7c>] panic+0x10c/0x2a8
    [<c04f4c1e>] sysrq_reset_seq_param_set+0x0/0x76
    [<c04f52cc>] __handle_sysrq+0x9c/0x19c
    [<c04f5946>] write_sysrq_trigger+0x64/0x78
    [<c020c7f6>] proc_reg_write+0x4a/0xa2
    [<c01acf0a>] vfs_write+0xac/0x308
    [<c01ad2b8>] ksys_write+0x62/0xda
    [<c01ad33e>] sys_write+0xe/0x16
    [<c0879860>] do_trap_ecall_u+0xd8/0xda
    [<c00037de>] ret_from_exception+0x0/0x66
    ---[ end Kernel panic - not syncing: sysrq triggered crash ]---

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
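To illustrate why the slot width matters for unwinding, here is a minimal host-side sketch of a frame-pointer walk. It assumes xlen=64 (so `xlen_t` is 64-bit) and simulates memory with a byte buffer in which frame pointers are plain offsets; `walk_frames` and this setup are hypothetical and only mimic the idea of reading the {fp, ra} pair stored just below the frame pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t xlen_t; /* assumption: xlen=64, as on s64ilp32 */

/* Each saved slot is XLEN bits wide, even though long is 32-bit. */
struct stackframe {
	xlen_t fp;
	xlen_t ra;
};

/*
 * Walk a simulated stack. `mem` stands in for memory and every frame
 * pointer is a byte offset into it (hypothetical setup for illustration).
 * Returns the number of return addresses written to ra_out.
 */
static size_t walk_frames(const uint8_t *mem, size_t mem_size, xlen_t fp,
			  xlen_t *ra_out, size_t max)
{
	size_t n = 0;

	while (n < max && fp >= sizeof(struct stackframe) && fp <= mem_size) {
		struct stackframe f;

		/* The saved pair sits directly below the frame pointer. */
		memcpy(&f, mem + fp - sizeof(f), sizeof(f));
		ra_out[n++] = f.ra;
		if (f.fp <= fp)
			break; /* callers live at higher addresses; stop otherwise */
		fp = f.fp;
	}
	return n;
}
```

If the struct members were declared with a 32-bit long instead, the walker would read fp and ra from the wrong offsets. That is consistent with the empty call trace shown before the patch.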
-
Guo Ren authored
Disable the KVM host feature for s64ilp32 first, and let's work on this feature after the s64ilp32 main feature is merged. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Anup Patel <anup@brainfault.org>
-
Guo Ren authored
The s64ilp32 has the ability to exclusively load and store (ld/sd) a pair of words from an address. Then the SLUB can take advantage of a cmpxchg_double implementation to avoid taking some locks. This patch provides an implementation of cmpxchg_double for 64-bit pairs, and activates the logic required for the SLUB to use these functions (HAVE_ALIGNED_STRUCT_PAGE and HAVE_CMPXCHG_DOUBLE). Similar commit: 5284e1b4 ("arm64: xchg: Implement cmpxchg_double") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
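The idea can be sketched in portable C11: two adjacent 32-bit words are treated as one naturally aligned 64-bit unit and swapped with a single 64-bit compare-and-exchange. This is an illustration only; `cmpxchg_pair` and `pack_pair` are hypothetical names, and the actual patch implements the operation in the kernel's own cmpxchg framework.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Two adjacent 32-bit words viewed as one 64-bit unit. */
union word_pair {
	struct {
		uint32_t first;
		uint32_t second;
	} w;
	uint64_t both;
};

/* Pack two words in memory order, independent of endianness. */
static uint64_t pack_pair(uint32_t first, uint32_t second)
{
	union word_pair v = { .w = { first, second } };

	return v.both;
}

/*
 * Hypothetical stand-in for cmpxchg_double: replace (old1, old2) with
 * (new1, new2) atomically, and only if both words still match.
 */
static bool cmpxchg_pair(_Atomic uint64_t *p,
			 uint32_t old1, uint32_t old2,
			 uint32_t new1, uint32_t new2)
{
	uint64_t expected = pack_pair(old1, old2);

	return atomic_compare_exchange_strong(p, &expected,
					      pack_pair(new1, new2));
}
```

SLUB-style users keep a pointer and a counter side by side, so a stale pointer with a changed counter makes the exchange fail instead of silently succeeding.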
-
Guo Ren authored
The s64ilp32 uses a 64-bit compiler, so it can support “Tetra Integer” mode (TImode), which represents a sixteen-byte (128-bit) integer. This makes it the first 32BIT Linux to support TImode :) Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
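As a small illustration of what a 128-bit integer type offers, the sketch below uses the `__int128` extension that GCC and Clang provide on 64-bit targets. The helper names are hypothetical.

```c
#include <stdint.h>

/* __int128 is a GCC/Clang extension on 64-bit targets (TImode in GCC). */
typedef unsigned __int128 u128;

/* Full 64x64 -> 128-bit multiply without losing the upper half. */
static inline u128 mul_u64_u64(uint64_t a, uint64_t b)
{
	return (u128)a * b;
}

static inline uint64_t u128_hi(u128 v)
{
	return (uint64_t)(v >> 64);
}

static inline uint64_t u128_lo(u128 v)
{
	return (uint64_t)v;
}
```

On a compiler targeting a 32-bit ISA this type is normally unavailable, which is why the commit calls it out as new for a 32BIT kernel.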
-
Guo Ren authored
The traditional rv32 Linux (s32ilp32) uses the generic lib/atomic64.c, whose atomic64 primitives are inaccurate and cannot work together with READ_ONCE/WRITE_ONCE and atomic_8/16/32. The s64ilp32 can use native AMO instructions to implement accurate atomic64 primitives. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
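The difference between the two approaches can be sketched in C11. The first variant mimics the idea behind a lock-based emulation (a spinlock guarding a plain 64-bit variable), and the second uses a single atomic read-modify-write, which a compiler targeting RV64 would typically turn into an AMO instruction. Both function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Lock-based emulation: only accesses that take the lock are safe.
 * A plain load or store that bypasses the lock is not protected.
 */
struct emu_atomic64 {
	atomic_flag lock;
	int64_t counter;
};

static int64_t emu_atomic64_add_return(struct emu_atomic64 *v, int64_t a)
{
	int64_t ret;

	while (atomic_flag_test_and_set_explicit(&v->lock,
						 memory_order_acquire))
		; /* spin until the lock is free */
	v->counter += a;
	ret = v->counter;
	atomic_flag_clear_explicit(&v->lock, memory_order_release);
	return ret;
}

/* Native variant: one atomic read-modify-write, no side lock needed. */
static int64_t native_atomic64_add_return(_Atomic int64_t *v, int64_t a)
{
	return atomic_fetch_add(v, a) + a;
}
```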
-
Guo Ren authored
There is generally no MMU_SV32 support in an xlen=64 ISA, but s64ilp32 selects 32BIT, which uses MMU_SV32 by default. This commit enables MMU_SV39 for 32BIT to satisfy the 4GB mapping requirement. Sv39 is the mandatory MMU mode in RVA20S64 and RVA22S64, so we needn't care about Sv48 & Sv57. We use duplicate remapping to solve the address sign extension problem from the compiler: make the address 0xffffffff80000000 equal to 0x80000000 by setting pg_dir[2] = pg_dir[510] and pg_dir[3] = pg_dir[511] in the page table.

Why didn't we prevent address sign extension in the compiler?
- Additional zero extension reduces performance.
- It avoids complex and unnecessary work for the compiler developers.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
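The index arithmetic behind the duplicate remapping can be checked with a short sketch. In Sv39 the top-level table has 512 entries and is indexed by virtual address bits 38:30; the function names below are hypothetical, and the table is a plain array rather than a real page table.

```c
#include <stdint.h>

#define SV39_PGDIR_SHIFT	30
#define SV39_PTRS_PER_PGD	512

/* Top-level Sv39 index: virtual address bits 38:30. */
static inline unsigned int sv39_pgd_index(uint64_t va)
{
	return (unsigned int)((va >> SV39_PGDIR_SHIFT) & (SV39_PTRS_PER_PGD - 1));
}

/*
 * Alias the sign-extended window (entries 510 and 511) onto the
 * zero-extended one (entries 2 and 3), as the commit describes.
 */
static void alias_kernel_window(uint64_t pg_dir[SV39_PTRS_PER_PGD])
{
	pg_dir[2] = pg_dir[510];
	pg_dir[3] = pg_dir[511];
}
```

With this aliasing, 0xffffffff80000000 (index 510) and 0x80000000 (index 2) resolve through the same entry, so it does not matter whether the compiler sign-extends a 32-bit pointer.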
-
Guo Ren authored
This needs to add Sv32 mode in the SATP CSR of RV64 ISA, a novel extension of 64-bit processors' MMU. It could save a bit of page table footprint and improve the page table walk performance: s64ilp32 with Sv39: PageTables: 136 kB s64ilp32 with Sv32: PageTables: 60 kB Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Just the same as ARCH_RV64I & ARCH_RV32I, add ARCH_RV64ILP32 config for s64ilp32 and turn on the s64ilp32 compile switch in the arch/riscv/Makefile. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Use abi_len to distinguish ELF32 from ELF64, because s64ilp32 has xlen=64 and abi_len=32 (__SIZEOF_POINTER__=4). Like s32ilp32, s64ilp32 is ELF32 based. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The s64ilp32 uses the rv64 ISA instruction set, not the rv32 ISA. So bpf_jit_comp32.c can't be used for s64ilp32, and we use bpf_jit_comp64.c instead. This patch makes s64ilp32 ebpf jit correct and improves the performance because bpf_jit_comp32.c has significant gaps in mapping ebpf 64-bit ISA. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The RV64ILP32 32-bit Linux kernel uses the same userspace address range as the 64-bit Linux compat mode, about 2GB. They have no difference from the hardware view, and all are running ILP32 on a 64-bit ISA. But the standard 32ilp32 Linux has a slightly bigger userspace address space, about 2.4GB. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
REG_L and REG_S can't cover the s64ilp32 case, because its __SIZEOF_POINTER__*8 != __riscv_xlen. So we introduce new PTR_L and PTR_S macros to help head.S and entry.S deal with the pointer data type, and replace all REG_L/S with PTR_L/S to fit the current algorithm in memcpy, memmove, memset, strcmp, strlen and strncmp. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
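The distinction can be sketched with preprocessor selection. The macros below are hypothetical string-valued stand-ins, not the kernel's assembler macros: REG_* follows the register width (xlen) and PTR_* follows the pointer width. On a non-RISC-V host, where `__riscv_xlen` is undefined, the sketch assumes the register width equals the pointer width.

```c
#include <string.h>

/* Assumption for non-RISC-V hosts: register width equals pointer width. */
#if defined(__riscv_xlen)
#define XLEN_BITS __riscv_xlen
#else
#define XLEN_BITS (__SIZEOF_POINTER__ * 8)
#endif

/* Register-sized load/store: follows xlen. */
#if XLEN_BITS == 64
#define REG_L "ld"
#define REG_S "sd"
#else
#define REG_L "lw"
#define REG_S "sw"
#endif

/* Pointer-sized load/store: follows __SIZEOF_POINTER__. */
#if __SIZEOF_POINTER__ == 8
#define PTR_L "ld"
#define PTR_S "sd"
#else
#define PTR_L "lw"
#define PTR_S "sw"
#endif

static const char *ptr_load_insn(void)
{
	return PTR_L;
}

static const char *ptr_store_insn(void)
{
	return PTR_S;
}
```

On s64ilp32 the two families diverge: REG_L would be `ld` while PTR_L would be `lw`, which is the case the commit needs to handle.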
-
Guo Ren authored
The s32ilp32 uses 9 bits as asid_bits because of the xlen=32 limitation of CSR. The xlen of s64ilp32 is 64 bits in width, and the SATP CSR format is the same for Sv32, Sv39, Sv48, and Sv57. So this patch makes asid mechanism support s64ilp32 with maximum num_asids. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
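For reference, the field widths involved can be written down as a sketch. The RISC-V privileged specification defines a 9-bit ASID at satp bits 30:22 for RV32 and a 16-bit ASID at bits 59:44 for RV64; the macro and function names below are hypothetical.

```c
#include <stdint.h>

/* satp ASID field per the RISC-V privileged specification. */
#define SATP32_ASID_BITS	9
#define SATP32_ASID_SHIFT	22
#define SATP64_ASID_BITS	16
#define SATP64_ASID_SHIFT	44

/* Number of distinct ASIDs for a given field width. */
static inline unsigned int num_asids(unsigned int asid_bits)
{
	return 1u << asid_bits;
}

/* Mask covering the ASID field of a 64-bit satp value. */
static inline uint64_t satp64_asid_mask(void)
{
	return ((1ULL << SATP64_ASID_BITS) - 1) << SATP64_ASID_SHIFT;
}

/* Extract the ASID from a 64-bit satp value. */
static inline unsigned int satp64_get_asid(uint64_t satp)
{
	return (unsigned int)((satp >> SATP64_ASID_SHIFT) &
			      ((1u << SATP64_ASID_BITS) - 1));
}
```

With the 64-bit layout the upper bound grows from 512 to 65536 ASIDs, although a given implementation may support fewer bits than the field allows.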
-
Guo Ren authored
The sbi uses xlen-sized values as the basic argument elements between m-mode and s-mode. The previous implementation assumes sizeof(xlen_t) = sizeof(long), but on s64ilp32 the two differ. So modify the sbi code to suit s64ilp32. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Once s64ilp32 lands, we can no longer use CONFIG_64/32BIT to distinguish XLEN data types, because the xlen is 64 but long & pointer are 32 bits for s64ilp32, and s64ilp32 is 32BIT from the software view. So introduce a new data type - "xlen_t" - and use __riscv_xlen instead of the CONFIG_64/32BIT ifdef macro. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
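A minimal sketch of such a type selection follows. It keys the typedef on `__riscv_xlen` rather than on the pointer or long width; the fallback branch for non-RISC-V hosts and the helper `truncate_to_ilp32_long` are assumptions added for illustration only.

```c
#include <stdint.h>

#if defined(__riscv_xlen) && __riscv_xlen == 32
typedef uint32_t xlen_t;
#elif defined(__riscv_xlen) && __riscv_xlen == 64
typedef uint64_t xlen_t;
#else
typedef uint64_t xlen_t; /* assumption for non-RISC-V hosts: pretend xlen=64 */
#endif

/*
 * Hypothetical helper showing the hazard: storing a register-sized
 * value in a 32-bit long keeps only the low 32 bits.
 */
static inline uint32_t truncate_to_ilp32_long(xlen_t v)
{
	return (uint32_t)v;
}
```

On s64ilp32 a CSR value such as satp needs all 64 bits, so holding it in a long would silently drop the upper half.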
-
Guo Ren authored
The 64ilp32's long size differs from the CSR xlen, so introduce a new UXL macro to distinguish UL & ULL. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
This patch didn't introduce any new syscall table but reused the existing rv32 implementation to ease the maintenance of the kernel side. Unify the UXL mode setting by ELF e_flags to support u64ilp32 & u32ilp32. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The u64ilp32 xlen is 64-bit, not the size of long, so change the elements of user_regs_struct with xlen_t to match different __riscv_xlen. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The 64ilp32 uses the same ELF32 as 32ilp32 and the 64lp64 uses ELF64, so separate apply_vdso_alternatives into 64 and 32 versions and serve for three kinds of vDSO - vdso32, vdso64, vdso64ilp32. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The pt_regs of u64ilp32 is the same as the 64-bit kernel's. So, change to use native_view instead. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The u64ilp32 reuses compat mode on the 64-bit Linux kernel, but the signal context is the same as the native 64-bit, not u32ilp32. So use the native signal procedure for u64ilp32 applications. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The 64ilp32 vDSO brings another new ABI into riscv, and the current vDSO flow needs to be adjusted to enable it. This patch separates VDSO32 (32ILP32), VDSO64 (64LP64), and VDSO64ILP32 more clearly, and enables VDSO64ILP32 as needed. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
This is the first patch to introduce the ILP32 ABI for RV64. Here is the diagram:

    +--------------------------------+------------+
    | +-------------------+--------+ | +--------+ |
    | |           (compat)|(compat)| | |        | |
    | |u64lp64    u64ilp32|u32ilp32| | |u32ilp32| | ABI
    | |           ^^^^^^^^|        | | |        | |
    | +-------------------+--------+ | +--------+ |
    | +-------------------+--------+ | +--------+ |
    | |       UXL=64      | UXL=32 | | | UXL=32 | | ISA
    | +-------------------+--------+ | +--------+ |
    +--------------------------------+------------+-------
    | +----------------------------+ | +--------+ |
    | |            64BIT           | | |  32BIT | | Kernel
    | |           s64lp64          | | |s32ilp32| | ABI
    | +----------------------------+ | +--------+ |
    | +----------------------------+ | +--------+ |
    | |            SXL=64          | | | SXL=32 | | ISA
    | +----------------------------+ | +--------+ |
    +--------------------------------+------------+

The 64ilp32 userspace needs another virtual dynamic shared object, independent from vdso32 (32ilp32) and vdso64 (64lp64).

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
This patch adds time-related vDSO common flow for vdso32, and it's an addition to commit: ad5d1122 ("riscv: use vDSO common flow to reduce the latency of the time-related functions"). Then we could reduce the latency of collecting clock information for u32ilp32 (native 32-bit userspace ecosystem), just like what we've done for u64lp64. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
After unifying vdso32 & vdso64 into vdso/, we no longer need the compat_vdso directory. This commit removes the whole compat_vdso/. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The Linux kernel ABI and the vDSO ABI can be different, but the current vdso/Makefile forces the vDSO to use the same ABI as the kernel. That isn't suitable for the next s64ilp32 patch series, because s64ilp32 uses the 64ilp32 ABI in the kernel but still uses 32ilp32 in userspace. This patch unifies vdso32 & compat_vdso into vdso/Makefile to solve this problem, similar to PowerPC's vDSO framework.

Before this:
- vdso/
- vdso/vdso.S
- vdso/gen_vdso_offsets.sh
- compat_vdso/compat_vdso.S
- compat_vdso/gen_compat_vdso_offsets.sh
- vdso.c

After this:
- vdso/
- vdso64.S
- vdso/gen_vdso64_offsets.sh
- vdso32.S
- vdso/gen_vdso32_offsets.sh
- vdso.c

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
-
Guo Ren authored
The combo spinlock supports both queued and ticket spinlocks in one Linux Image and selects between them at boot time via the errata mechanism. Here is the function size (bytes) comparison table:

    TYPE                     : COMBO | TICKET | QUEUED
    arch_spin_lock           :  106  |   60   |   50
    arch_spin_unlock         :   54  |   36   |   26
    arch_spin_trylock        :  110  |   72   |   54
    arch_spin_is_locked      :   48  |   34   |   20
    arch_spin_is_contended   :   56  |   40   |   24
    arch_spin_value_unlocked :   48  |   34   |   24

One example, the disassembly of the combo arch_spin_unlock:

    0xffffffff8000409c <+14>: nop                # detour slot
    0xffffffff800040a0 <+18>: fence rw,w         # queued spinlock start
    0xffffffff800040a4 <+22>: sb zero,0(a4)      # queued spinlock end
    0xffffffff800040a8 <+26>: ld s0,8(sp)
    0xffffffff800040aa <+28>: addi sp,sp,16
    0xffffffff800040ac <+30>: ret
    0xffffffff800040ae <+32>: lw a5,0(a4)        # ticket spinlock start
    0xffffffff800040b0 <+34>: sext.w a5,a5
    0xffffffff800040b2 <+36>: fence rw,w
    0xffffffff800040b6 <+40>: addiw a5,a5,1
    0xffffffff800040b8 <+42>: slli a5,a5,0x30
    0xffffffff800040ba <+44>: srli a5,a5,0x30
    0xffffffff800040bc <+46>: sh a5,0(a4)        # ticket spinlock end
    0xffffffff800040c0 <+50>: ld s0,8(sp)
    0xffffffff800040c2 <+52>: addi sp,sp,16
    0xffffffff800040c4 <+54>: ret

The qspinlock is smaller and faster than the ticket lock when both are in the fast path, and the combo spinlock provides a compatible Linux Image for processors with different micro-architecture designs (weak/strict forward-progress guarantee).

Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
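For readers unfamiliar with the ticket path in the disassembly above (load, add one, store the 16-bit half), here is a minimal C11 sketch of a ticket lock. It is an illustration under simplifying assumptions, namely two separate 16-bit counters and hypothetical function names, and is not the kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* next: the ticket handed to the next arrival; owner: ticket being served. */
struct ticket_lock {
	_Atomic uint16_t next;
	_Atomic uint16_t owner;
};

static void ticket_lock_acquire(struct ticket_lock *l)
{
	/* Take a ticket, then wait until it is our turn. */
	uint16_t me = atomic_fetch_add_explicit(&l->next, 1,
						memory_order_relaxed);

	while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
		; /* spin */
}

static bool ticket_trylock(struct ticket_lock *l)
{
	uint16_t owner = atomic_load_explicit(&l->owner, memory_order_relaxed);
	uint16_t expected = owner;

	/* Succeeds only when nobody holds or waits, i.e. next == owner. */
	return atomic_compare_exchange_strong_explicit(&l->next, &expected,
			(uint16_t)(owner + 1),
			memory_order_acquire, memory_order_relaxed);
}

static void ticket_unlock(struct ticket_lock *l)
{
	/* Only the holder writes owner, so load-then-store is sufficient. */
	uint16_t owner = atomic_load_explicit(&l->owner, memory_order_relaxed);

	atomic_store_explicit(&l->owner, (uint16_t)(owner + 1),
			      memory_order_release);
}
```

The unlock here mirrors the ticket section of the disassembly: a release-ordered store of the incremented 16-bit owner value.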
-
Guo Ren authored
Early versions of the T-Head C9xx cores have a store merge buffer delay problem. The store merge buffer improves store queue performance by merging multiple store requests, but when no further store requests follow, the prior single store request can wait in the store queue for a long time. That causes significant problems for communication between cores. Appending a fence w,o immediately flushes the store merge buffer and lets other cores see the write result. This patch applies the WRITE_ONCE errata to handle the non-standard behavior by appending a fence w,o instruction to WRITE_ONCE(). Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
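The shape of such a workaround can be sketched as a store followed by a fence. This is an illustration only: the macro names are hypothetical, the fence is written in the assembler's `fence w,o` operand syntax as an assumption about how the commit's fence is spelled, and on non-RISC-V hosts the sketch falls back to a compiler barrier so it still builds.

```c
#include <stdint.h>

#if defined(__riscv)
/* Assumed spelling of the flush fence: predecessor writes, successor output. */
#define store_flush_fence() __asm__ __volatile__("fence w,o" ::: "memory")
#else
/* Other hosts: compiler barrier only, so the sketch still compiles. */
#define store_flush_fence() __asm__ __volatile__("" ::: "memory")
#endif

/* Hypothetical variant of WRITE_ONCE that appends the flush fence. */
#define WRITE_ONCE_FLUSH(x, val)				\
	do {							\
		*(volatile __typeof__(x) *)&(x) = (val);	\
		store_flush_fence();				\
	} while (0)

/* Example use: publish a flag that another core is polling. */
static int publish_flag(int *flag)
{
	WRITE_ONCE_FLUSH(*flag, 1);
	return *flag;
}
```

The fence does not change what is stored; per the commit, its purpose is to push the pending store out of the merge buffer so that a polling core sees it promptly.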
-
Guo Ren authored
The requirements of qspinlock have been documented by commit: a8ad07e5 ("asm-generic: qspinlock: Indicate the use of mixed-size atomics"). Although the RISC-V ISA only gives a weaker forward-progress guarantee for LR/SC, which doesn't satisfy the qspinlock requirements above, that doesn't prevent riscv vendors from implementing LR/SC with a strong forward-progress guarantee in their microarchitecture to match the xchg_tail requirement. The T-HEAD C9xx processor is one of them. We've tested the patch on SOPHGO SG2042 [1] and it passed the stress test (Fedora/Ubuntu/OpenEuler ...).

Here is the performance comparison between qspinlock and ticket_lock:

    sysbench test=threads threads=32 yields=100 lock=8 (+13.7%):
      queued_spinlock 0.5109/0.00
      ticket_spinlock 0.5814/0.00

    perf futex/hash (+6.7%):
      queued_spinlock 14443937 operations/sec (+- 0.09%)
      ticket_spinlock 1353215 operations/sec (+- 0.15%)

    perf futex/wake-parallel (+8.6%):
      queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
      ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)

    perf futex/requeue (+4.2%):
      queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
      ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)

The qspinlock shows a significant improvement over the ticket_lock on SOPHGO SG2042.

[1] https://en.sophgo.com/product/introduce/sg2042.html

Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
-
Guo Ren authored
Move ticket-lock definition into an independent file. This is the preparation for the next combo spinlock of riscv. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-