- Jul 21, 2023
-
-
Guo Ren authored
This patch adds the time-related vDSO common flow for vdso32, as an addition to commit ad5d1122 ("riscv: use vDSO common flow to reduce the latency of the time-related functions"). This reduces the latency of collecting clock information for u32ilp32 (the native 32-bit userspace ecosystem), just as was done for u64lp64. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
After unifying vdso32 & vdso64 into vdso/, the compat_vdso directory is no longer needed. This commit removes the whole compat_vdso/. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The Linux kernel ABI and the vDSO ABI can differ, but the current vdso/Makefile forces the vDSO to use the same ABI as the kernel. That isn't suitable for the upcoming s64ilp32 patch series, because s64ilp32 uses the 64ilp32 ABI in the kernel but still uses 32ilp32 in userspace. This patch unifies vdso32 & compat_vdso into vdso/Makefile to solve this problem, similar to PowerPC's vDSO framework.

Before this:
 - vdso/vdso.S
 - vdso/gen_vdso_offsets.sh
 - compat_vdso/compat_vdso.S
 - compat_vdso/gen_compat_vdso_offsets.sh
 - vdso.c

After this:
 - vdso/vdso64.S
 - vdso/gen_vdso64_offsets.sh
 - vdso/vdso32.S
 - vdso/gen_vdso32_offsets.sh
 - vdso.c

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
-
Guo Ren authored
The combo spinlock supports both queued and ticket locks in one Linux image and selects between them at boot time via the errata mechanism. Function size (bytes) comparison:

TYPE                     : COMBO | TICKET | QUEUED
arch_spin_lock           :   106 |     60 |     50
arch_spin_unlock         :    54 |     36 |     26
arch_spin_trylock        :   110 |     72 |     54
arch_spin_is_locked      :    48 |     34 |     20
arch_spin_is_contended   :    56 |     40 |     24
arch_spin_value_unlocked :    48 |     34 |     24

One example of disassembling the combo arch_spin_unlock:

0xffffffff8000409c <+14>: nop              # detour slot
0xffffffff800040a0 <+18>: fence rw,w       # queued spinlock start
0xffffffff800040a4 <+22>: sb zero,0(a4)    # queued spinlock end
0xffffffff800040a8 <+26>: ld s0,8(sp)
0xffffffff800040aa <+28>: addi sp,sp,16
0xffffffff800040ac <+30>: ret
0xffffffff800040ae <+32>: lw a5,0(a4)      # ticket spinlock start
0xffffffff800040b0 <+34>: sext.w a5,a5
0xffffffff800040b2 <+36>: fence rw,w
0xffffffff800040b6 <+40>: addiw a5,a5,1
0xffffffff800040b8 <+42>: slli a5,a5,0x30
0xffffffff800040ba <+44>: srli a5,a5,0x30
0xffffffff800040bc <+46>: sh a5,0(a4)      # ticket spinlock end
0xffffffff800040c0 <+50>: ld s0,8(sp)
0xffffffff800040c2 <+52>: addi sp,sp,16
0xffffffff800040c4 <+54>: ret

The qspinlock is smaller and faster than the ticket lock when everything stays on the fast path, and the combo spinlock provides a single compatible Linux image for processors with different micro-architectural designs (weak vs. strict forward-progress guarantees).

Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
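The combo idea can be sketched in portable C11. In the kernel the selection is patched into the code via the errata/alternative mechanism (the "detour slot" above); a plain boolean approximates that here, and a test-and-set lock stands in for the real qspinlock slowpath. All names are illustrative, not the kernel's:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One lock word, two algorithms, chosen once at "boot". */
struct combo_lock { _Atomic unsigned int val; };

static bool use_queued; /* decided once during early boot, e.g. by CPU probing */

static void ticket_lock(struct combo_lock *l)
{
    /* val = (next:16 | owner:16): take a ticket, wait for our turn */
    unsigned int t = atomic_fetch_add_explicit(&l->val, 1u << 16,
                                               memory_order_acquire);
    unsigned int ticket = t >> 16;

    while ((atomic_load_explicit(&l->val, memory_order_acquire) & 0xffff)
           != ticket)
        ; /* cpu_relax() in the kernel */
}

static void ticket_unlock(struct combo_lock *l)
{
    atomic_fetch_add_explicit(&l->val, 1, memory_order_release);
}

static void tas_lock(struct combo_lock *l) /* stand-in for the queued path */
{
    unsigned int expected = 0;

    while (!atomic_compare_exchange_weak_explicit(&l->val, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
        expected = 0;
}

static void tas_unlock(struct combo_lock *l)
{
    atomic_store_explicit(&l->val, 0, memory_order_release);
}

static void combo_lock_acquire(struct combo_lock *l)
{
    if (use_queued)
        tas_lock(l);
    else
        ticket_lock(l);
}

static void combo_lock_release(struct combo_lock *l)
{
    if (use_queued)
        tas_unlock(l);
    else
        ticket_unlock(l);
}
```

Because the choice is fixed before any lock is taken, both algorithms can share the same lock word, which is what makes a single image practical.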
-
Guo Ren authored
Early versions of the T-Head C9xx cores have a store merge buffer delay problem. The store merge buffer improves store queue performance by merging multiple store requests, but when no further stores follow, a prior single store request can sit in the store queue for a long time. That causes significant problems for communication between cores. Appending a fence w,o immediately flushes the store merge buffer and lets other cores see the write result. This patch applies the WRITE_ONCE errata to handle the non-standard behavior by appending a fence w,o instruction to WRITE_ONCE(). Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
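The shape of such an errata variant can be sketched in portable C. The real patch emits a RISC-V "fence w,o" after the store; atomic_thread_fence() below is only a portable stand-in for illustration, and the macro name is ours, not the kernel's:

```c
#include <stdatomic.h>

/* Sketch of the errata idea: a WRITE_ONCE()-style store followed by a
 * fence, so the store leaves the merge buffer promptly and becomes
 * visible to other cores.  On T-Head C9xx the fence would be the RISC-V
 * "fence w, o" instruction; here a full fence stands in for it.
 */
#define WRITE_ONCE_FLUSH(x, val)                          \
    do {                                                  \
        *(volatile __typeof__(x) *)&(x) = (val);          \
        atomic_thread_fence(memory_order_seq_cst);        \
    } while (0)
```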
-
Guo Ren authored
The requirements of qspinlock have been documented by commit a8ad07e5 ("asm-generic: qspinlock: Indicate the use of mixed-size atomics"). Although the RISC-V ISA only gives a weak forward-progress guarantee for LR/SC, which doesn't satisfy the qspinlock requirements above, that doesn't prevent RISC-V vendors from implementing a strong forward-progress guarantee for LR/SC in the microarchitecture to match the xchg_tail requirement. The T-HEAD C9xx processor is one such case. We've tested the patch on the SOPHGO SG2042 [1] and passed the stress test (Fedora/Ubuntu/OpenEuler ...). Here is the performance comparison between qspinlock and ticket_lock:

sysbench test=threads threads=32 yields=100 lock=8 (+13.7%):
  queued_spinlock 0.5109/0.00
  ticket_spinlock 0.5814/0.00

perf futex/hash (+6.7%):
  queued_spinlock 14443937 operations/sec (+- 0.09%)
  ticket_spinlock 1353215 operations/sec (+- 0.15%)

perf futex/wake-parallel (+8.6%):
  queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
  ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)

perf futex/requeue (+4.2%):
  queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
  ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)

The qspinlock shows a significant improvement over the ticket_lock on the SOPHGO SG2042.

[1] https://en.sophgo.com/product/introduce/sg2042.html

Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
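The xchg_tail requirement mentioned above concerns a mixed-size atomic: swapping only the 16-bit tail field of the 32-bit qspinlock word. On architectures without a native short exchange it is built from a 32-bit CAS loop, which is where the LR/SC forward-progress guarantee matters. A hedged portable sketch (field layout illustrative, not the kernel's exact one):

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically replace the tail half of the 32-bit lock word and return
 * the previous tail, using a 32-bit CAS loop.  A weak LR/SC forward
 * guarantee means this loop may livelock under contention; a strong
 * guarantee (as on T-HEAD C9xx) ensures it eventually succeeds.
 */
#define TAIL_SHIFT 16
#define TAIL_MASK  0xffff0000u

static uint32_t xchg_tail(_Atomic uint32_t *lock, uint32_t tail)
{
    uint32_t old = atomic_load_explicit(lock, memory_order_relaxed);

    for (;;) {
        uint32_t new = (old & ~TAIL_MASK) | (tail << TAIL_SHIFT);

        /* on failure, 'old' is reloaded with the current value */
        if (atomic_compare_exchange_weak_explicit(lock, &old, new,
                memory_order_acq_rel, memory_order_relaxed))
            return old >> TAIL_SHIFT;
    }
}
```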
-
Guo Ren authored
Move the ticket-lock definition into an independent file. This prepares for the upcoming RISC-V combo spinlock. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
qspinlock's arch_spinlock_t already contains an atomic_t val, which satisfies the ticket-lock requirement. Thus, unify arch_spinlock_t into qspinlock_types.h. This prepares for the upcoming combo spinlock. Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
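The reason one type can serve both algorithms: the qspinlock's 32-bit atomic word can equally be read as the ticket lock's owner/next halves. A minimal sketch, with illustrative accessor names (not the kernel's):

```c
#include <stdatomic.h>
#include <stdint.h>

/* One 32-bit atomic word; qspinlock uses it whole, the ticket lock
 * views it as (next:16 | owner:16).  Accessors are illustrative.
 */
typedef struct {
    _Atomic uint32_t val;
} arch_spinlock_t;

static inline uint16_t ticket_owner(arch_spinlock_t *l)
{
    return atomic_load_explicit(&l->val, memory_order_relaxed) & 0xffff;
}

static inline uint16_t ticket_next(arch_spinlock_t *l)
{
    return atomic_load_explicit(&l->val, memory_order_relaxed) >> 16;
}
```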
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Fix a qspinlock issue that loops calling cpu_relax() and never exits. The call trace is: queued_spin_lock_slowpath -> arch_mcs_spin_lock_contended -> smp_cond_load_acquire. RISC-V has not defined smp_cond_load_acquire, so it uses the generic function defined in include/asm-generic/barrier.h. The generic smp_cond_load_acquire calls smp_cond_load_relaxed, which loops calling READ_ONCE and cpu_relax. READ_ONCE needs a barrier after it to observe the new value. Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
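The pattern being fixed can be sketched in portable C11: spin with relaxed loads (READ_ONCE plus cpu_relax in the kernel), then issue an acquire fence once the condition holds so that later accesses observe everything published before the flag was set. A hedged sketch, not the kernel's actual implementation:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Sketch of the smp_cond_load_acquire() shape: relaxed polling loop,
 * followed by an acquire fence that upgrades the final read.
 */
static uint32_t cond_load_acquire(_Atomic uint32_t *p, uint32_t want)
{
    uint32_t v;

    for (;;) {
        v = atomic_load_explicit(p, memory_order_relaxed);
        if (v == want)
            break;
        /* cpu_relax() would go here */
    }
    atomic_thread_fence(memory_order_acquire);
    return v;
}
```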
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Add FORCE_MAX_ZONEORDER to support custom max order requirements. The default of 13 allows requesting large (16MB) contiguous memory. Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
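A sketch of what such a Kconfig knob looks like (wording and placement are illustrative; the arithmetic is why 13 yields 16 MiB: the largest allocation order is FORCE_MAX_ZONEORDER - 1 = 12, and 2^12 pages x 4 KiB = 16 MiB):

```
config FORCE_MAX_ZONEORDER
	int "Maximum zone order"
	default 13
	help
	  The kernel page allocator limits the size of the largest
	  physically contiguous allocation to 2^(order-1) pages.
	  With 4 KiB pages, the default of 13 permits contiguous
	  allocations of up to 16 MiB.
```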
-
Xiaoguang Xing authored
Avoid a page fault when machine_kexec() calls the kexec method located in the control_code_buffer. When PUD_SIZE is used as the map size, __set_memory() only updates init_mm, so other tasks hit page faults when they use the PUD entry modified via init_mm. Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Exchange messages and data between the AST2600 BMC and the host SG2042 over a PCIe link using BAR1. The KCS channel offset has to be set to 0x0e80 according to the AST2600 User Guide. Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
- Jul 10, 2023
-
-
Linus Torvalds authored
-
Linus Torvalds authored
We just sorted the entries and fields last release, so just out of a perverse sense of curiosity, I decided to see if we can keep things ordered for even just one release. The answer is "No. No we cannot". I suggest that all kernel developers will need weekly training sessions, involving a lot of Big Bird and Sesame Street. And at the yearly maintainer summit, we will all sing the alphabet song together. I doubt I will keep doing this. At some point "perverse sense of curiosity" turns into just a cold dark place filled with sadness and despair. Repeats: 80e62bc8 ("MAINTAINERS: re-sort all entries and fields") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.infradead.org/users/hch/dma-mapping
Linus Torvalds authored
Pull dma-mapping fixes from Christoph Hellwig:
- swiotlb area sizing fixes (Petr Tesarik)
* tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: reduce the number of areas to match actual memory pool size
  swiotlb: always set the number of areas before allocating the pool
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull irq update from Borislav Petkov:
- Optimize IRQ domain's name assignment
* tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Use return value of strreplace()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 fpu fix from Borislav Petkov:
- Do FPU AP initialization on Xen PV too which got missed by the recent boot reordering work
* tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/xen: Fix secondary processors' FPU initialization
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 fix from Thomas Gleixner:
"A single fix for the mechanism to park CPUs with an INIT IPI.

 On shutdown or kexec, the kernel tries to park the non-boot CPUs with an INIT IPI. But the same code path is also used by the crash utility. If the CPU which panics is not the boot CPU then it sends an INIT IPI to the boot CPU which resets the machine. Prevent this by validating that the CPU which runs the stop mechanism is the boot CPU. If not, leave the other CPUs in HLT"
* tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smp: Don't send INIT to boot CPU
-