- Aug 05, 2023
-
-
Guo Ren authored
Connect riscv to the Compact NUMA-aware lock (CNA), which uses the PARAVIRT_SPINLOCKS static_call hooks. See the numa_spinlock= entry in Documentation/admin-guide/kernel-parameters.txt to try it. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The pv_ops mechanism belongs to x86's custom infrastructure, so clean up cna_configure_spin_lock_slowpath() with standard code. This is preparation for riscv support of the CNA qspinlock. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Add trace points for pv_kick/pv_wait; here is the output:

 # entries-in-buffer/entries-written: 33927/33927   #P:12
 #
 #                                _-----=> irqs-off/BH-disabled
 #                               / _----=> need-resched
 #                              | / _---=> hardirq/softirq
 #                              || / _--=> preempt-depth
 #                              ||| / _-=> migrate-disable
 #                              |||| /     delay
 #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
 #              | |         |   |||||     |         |
              sh-100     [001] d..2.    28.312294: pv_wait: cpu 1 out of wfi
          <idle>-0       [000] d.h4.    28.322030: pv_kick: cpu 0 kick target cpu 1
              sh-100     [001] d..2.    30.982631: pv_wait: cpu 1 out of wfi
          <idle>-0       [000] d.h4.    30.993289: pv_kick: cpu 0 kick target cpu 1
              sh-100     [002] d..2.    44.987573: pv_wait: cpu 2 out of wfi
          <idle>-0       [000] d.h4.    44.989000: pv_kick: cpu 0 kick target cpu 2
          <idle>-0       [003] d.s3.    51.593978: pv_kick: cpu 3 kick target cpu 4
       rcu_sched-15      [004] d..2.    51.595192: pv_wait: cpu 4 out of wfi
 lock_torture_wr-115     [004] ...2.    52.656482: pv_kick: cpu 4 kick target cpu 2
 lock_torture_wr-113     [002] d..2.    52.659146: pv_wait: cpu 2 out of wfi
 lock_torture_wr-114     [008] d..2.    52.659507: pv_wait: cpu 8 out of wfi
 lock_torture_wr-114     [008] d..2.    52.663503: pv_wait: cpu 8 out of wfi
 lock_torture_wr-113     [002] ...2.    52.666128: pv_kick: cpu 2 kick target cpu 8
 lock_torture_wr-114     [008] d..2.    52.667261: pv_wait: cpu 8 out of wfi
 lock_torture_wr-114     [009] .n.2.    53.141515: pv_kick: cpu 9 kick target cpu 11
 lock_torture_wr-113     [002] d..2.    53.143339: pv_wait: cpu 2 out of wfi
 lock_torture_wr-116     [007] d..2.    53.143412: pv_wait: cpu 7 out of wfi
 lock_torture_wr-118     [000] d..2.    53.143457: pv_wait: cpu 0 out of wfi
 lock_torture_wr-115     [008] d..2.    53.143481: pv_wait: cpu 8 out of wfi
 lock_torture_wr-117     [011] d..2.    53.143522: pv_wait: cpu 11 out of wfi
 lock_torture_wr-117     [011] ...2.    53.143987: pv_kick: cpu 11 kick target cpu 8
 lock_torture_wr-115     [008] ...2.    53.144269: pv_kick: cpu 8 kick target cpu 7

Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
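A minimal sketch of how such a tracepoint is typically declared in a trace header; the field names and header plumbing are assumptions, not the patch's exact code:

  /* trace-header-style sketch; needs the usual CREATE_TRACE_POINTS
   * boilerplate in exactly one .c file. */
  #include <linux/tracepoint.h>

  TRACE_EVENT(pv_kick,
          TP_PROTO(int cpu, int target),
          TP_ARGS(cpu, target),
          TP_STRUCT__entry(
                  __field(int, cpu)
                  __field(int, target)
          ),
          TP_fast_assign(
                  __entry->cpu = cpu;
                  __entry->target = target;
          ),
          TP_printk("cpu %d kick target cpu %d",
                    __entry->cpu, __entry->target)
  );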
-
Guo Ren authored
Add a kconfig entry for paravirt_spinlock, an unfair, virtualization-friendly qspinlock backend that halts the virtual CPU rather than spinning. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Implement pv_kick() via SBI, and add detection of the SBI_EXT_PVLOCK extension. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
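A minimal sketch of what pv_kick() via SBI could look like. SBI_EXT_PVLOCK and the kick-cpu function ID come from this series rather than the ratified SBI spec, so both names are assumptions:

  #include <asm/sbi.h>
  #include <asm/smp.h>

  static void pv_kick(int cpu)
  {
          /* One ecall per kick: ask the SBI/hypervisor to wake the hart. */
          sbi_ecall(SBI_EXT_PVLOCK, SBI_EXT_PVLOCK_KICK_CPU,
                    cpuid_to_hartid_map(cpu), 0, 0, 0, 0, 0);
  }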
-
Guo Ren authored
Disable the qspinlock slow path in favor of PV optimizations, which allow the hypervisor to 'idle' the guest on lock contention. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
We only need to call kvm_vcpu_kick() to bring the target_vcpu out of the halt state. No irq is raised and no other request is made; it is a pure vcpu kick. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
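A hedged sketch of the host-side handler; the vCPU-lookup helper and the return plumbing are illustrative, and only kvm_vcpu_kick() is the point:

  static int pvlock_kick_cpu(struct kvm_vcpu *vcpu, unsigned long hartid)
  {
          /* assumed helper: map the guest hart id to its vCPU */
          struct kvm_vcpu *target = riscv_hartid_to_vcpu(vcpu->kvm, hartid);

          if (!target)
                  return SBI_ERR_INVALID_PARAM;

          kvm_vcpu_kick(target);  /* pure kick: no irq, no extra request */
          return SBI_SUCCESS;
  }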
-
Guo Ren authored
Add the files and functions needed to support the SBI PVLOCK (paravirt qspinlock kick_cpu) extension. This is preparation for the core kick_cpu implementation that follows. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Use static_call to switch between:

  native_queued_spin_lock_slowpath()  <->  __pv_queued_spin_lock_slowpath()
  native_queued_spin_unlock()         <->  __pv_queued_spin_unlock()

This finishes the pv_wait implementation; pv_kick needs the SBI definitions from the next patches. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
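A minimal sketch of the static_call wiring, assuming the standard qspinlock slow-path signatures; the init hook name is illustrative:

  #include <linux/static_call.h>
  #include <linux/types.h>

  struct qspinlock;
  void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
  void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);

  DEFINE_STATIC_CALL(pv_queued_spin_lock_slowpath,
                     native_queued_spin_lock_slowpath);

  /* the lock's slow path then becomes:
   *   static_call(pv_queued_spin_lock_slowpath)(lock, val);
   */

  static void __init pv_qspinlock_init(void)
  {
          /* retarget the slow path only when running as a guest */
          static_call_update(pv_queued_spin_lock_slowpath,
                             __pv_queued_spin_lock_slowpath);
  }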
-
Guo Ren authored
Add a static key controlling whether virt_spin_lock() should be called. When running on bare metal, set the new key to false. KVM guests fall back to a test-and-set spinlock, because fair locks have horrible lock-holder preemption issues. The virt_spin_lock_key provides a shortcut in queued_spin_lock_slowpath(), allowing virt_spin_lock() to hijack it. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
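For reference, a sketch modeled on the x86 virt_spin_lock(), assuming the generic qspinlock definitions (lock->val, _Q_LOCKED_VAL) are in scope; the riscv version in this patch may differ in detail:

  DECLARE_STATIC_KEY_FALSE(virt_spin_lock_key);

  static inline bool virt_spin_lock(struct qspinlock *lock)
  {
          if (!static_branch_likely(&virt_spin_lock_key))
                  return false;   /* bare metal: take the queued slow path */

          /* Unfair test-and-set loop, tolerant of vCPU preemption. */
          do {
                  while (atomic_read(&lock->val) != 0)
                          cpu_relax();
          } while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);

          return true;
  }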
-
Guo Ren authored
The RISC-V ISA only mandates a weak LR/SC forward-progress guarantee, which does not satisfy qspinlock's requirements. But vendors may implement LR/SC with a stronger forward-progress guarantee, ensuring xchg_tail finishes in time on any kind of hart. T-HEAD is one vendor that implements such strong forward-progress LR/SC instruction pairs, so enable qspinlock for T-HEAD with the help of errata init. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
Allow cmdline to force the kernel to use queued_spinlock when CONFIG_RISCV_COMBO_SPINLOCKS=y. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The combo spinlock supports both queued and ticket locks in one Linux image, selecting between them at boot time via the errata mechanism. Here is the function size (bytes) comparison:

  TYPE                     : COMBO | TICKET | QUEUED
  arch_spin_lock           :   106 |     60 |     50
  arch_spin_unlock         :    54 |     36 |     26
  arch_spin_trylock        :   110 |     72 |     54
  arch_spin_is_locked      :    48 |     34 |     20
  arch_spin_is_contended   :    56 |     40 |     24
  arch_spin_value_unlocked :    48 |     34 |     24

One example, the disassembly of the combo arch_spin_unlock:

  0xffffffff8000409c <+14>:  nop                 # detour slot
  0xffffffff800040a0 <+18>:  fence   rw,w        # queued spinlock start
  0xffffffff800040a4 <+22>:  sb      zero,0(a4)  # queued spinlock end
  0xffffffff800040a8 <+26>:  ld      s0,8(sp)
  0xffffffff800040aa <+28>:  addi    sp,sp,16
  0xffffffff800040ac <+30>:  ret
  0xffffffff800040ae <+32>:  lw      a5,0(a4)    # ticket spinlock start
  0xffffffff800040b0 <+34>:  sext.w  a5,a5
  0xffffffff800040b2 <+36>:  fence   rw,w
  0xffffffff800040b6 <+40>:  addiw   a5,a5,1
  0xffffffff800040b8 <+42>:  slli    a5,a5,0x30
  0xffffffff800040ba <+44>:  srli    a5,a5,0x30
  0xffffffff800040bc <+46>:  sh      a5,0(a4)    # ticket spinlock end
  0xffffffff800040c0 <+50>:  ld      s0,8(sp)
  0xffffffff800040c2 <+52>:  addi    sp,sp,16
  0xffffffff800040c4 <+54>:  ret

The qspinlock is smaller and faster than the ticket lock when everything stays on the fast path, and the combo spinlock provides a single compatible Linux image for processors with different micro-architecture designs (weak vs. strong forward-progress guarantee LR/SC). Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
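A sketch of the combo dispatch: one static branch, flipped during errata probing at boot, selects the backend. The key name is an assumption:

  DECLARE_STATIC_KEY_TRUE(qspinlock_key);

  static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
  {
          if (static_branch_likely(&qspinlock_key))
                  queued_spin_lock(lock);
          else
                  ticket_spin_lock(lock);
  }

The nop in the disassembly above is this static-branch detour slot: when the key is disabled it is patched into a jump over the queued body to the ticket body.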
-
Guo Ren authored
The requirements of qspinlock have been documented by commit a8ad07e5 ("asm-generic: qspinlock: Indicate the use of mixed-size atomics"). Although the RISC-V ISA only mandates a weaker LR/SC forward-progress guarantee, which doesn't satisfy the requirements of qspinlock above, that doesn't prevent some riscv vendors from implementing a strong forward-progress guarantee for LR/SC in the micro-architecture to match the xchg_tail requirement. The T-HEAD C9xx processor is one of them. We've tested the patch on SOPHGO sg2042 & th1520 and passed stress tests on Fedora, Ubuntu, OpenEuler, ...

Here is the performance comparison between qspinlock and ticket_lock on sg2042 (64 cores):

sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
  queued_spinlock 0.5109/0.00
  ticket_spinlock 0.5814/0.00

perf futex/hash (+6.7%):
  queued_spinlock 1444393 operations/sec (+- 0.09%)
  ticket_spinlock 1353215 operations/sec (+- 0.15%)

perf futex/wake-parallel (+8.6%):
  queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
  ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)

perf futex/requeue (+4.2%):
  queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
  ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)

System Benchmarks (+6.4%):

queued_spinlock: System Benchmarks Index Values
                                               BASELINE       RESULT    INDEX
  Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
  Double-Precision Whetstone                       55.0     182422.8  33167.8
  Execl Throughput                                 43.0      13116.6   3050.4
  File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
  File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
  File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
  Pipe Throughput                               12440.0   23058600.5  18535.9
  Pipe-based Context Switching                   4000.0    2835617.7   7089.0
  Process Creation                                126.0      12537.3    995.0
  Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
  Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
  System Call Overhead                          15000.0   33308301.3  22205.5
                                                                     ========
  System Benchmarks Index Score                                       12426.1

ticket_spinlock: System Benchmarks Index Values
                                               BASELINE       RESULT    INDEX
  Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
  Double-Precision Whetstone                       55.0     181921.0  33076.5
  Execl Throughput                                 43.0      12625.1   2936.1
  File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
  File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
  File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
  Pipe Throughput                               12440.0   20594018.7  16554.7
  Pipe-based Context Switching                   4000.0    2571117.7   6427.8
  Process Creation                                126.0      10798.4    857.0
  Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
  Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
  System Call Overhead                          15000.0   30766778.4  20511.2
                                                                     ========
  System Benchmarks Index Score                                       11670.7

The qspinlock shows a significant improvement over the ticket_lock on the 64-core SOPHGO SG2042 platform. Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
-
Guo Ren authored
Early versions of the T-Head C9xx cores have a store-merge-buffer delay problem. The store merge buffer improves store-queue performance by merging multiple store requests, but when no further stores follow, a prior single store request can sit in the store queue for a long time. That causes significant problems for communication between cores. The problem was found on the sg2042 & th1520 platforms with the qspinlock lock-torture test. Appending a fence w,o immediately flushes the store merge buffer and lets other cores see the write result. Apply a WRITE_ONCE errata that handles this non-standard behavior by appending a fence w,o instruction to WRITE_ONCE(). Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
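A sketch of how such an errata is typically wired up with the riscv ALTERNATIVE mechanism; the errata and config identifiers are assumptions taken from this series, not upstream names:

  /* patched into a real fence only on affected T-Head cores */
  #define __write_once_flush()                                      \
          asm volatile(ALTERNATIVE("nop", "fence w, o",             \
                                   THEAD_VENDOR_ID,                 \
                                   ERRATA_THEAD_WRITE_ONCE,         \
                                   CONFIG_ERRATA_THEAD_WRITE_ONCE)  \
                       : : : "memory")

  #define WRITE_ONCE(x, val)                                        \
  do {                                                              \
          __WRITE_ONCE(x, val);                                     \
          __write_once_flush();  /* drain the store merge buffer */ \
  } while (0)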
-
Guo Ren authored
Move the ticket-lock definitions into an independent file. This is preparation for riscv's combo spinlock in the next patches. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
Guo Ren authored
The qspinlock's arch_spinlock_t already contains an atomic_t val, which satisfies the ticket-lock requirement. Thus, unify arch_spinlock_t into qspinlock_types.h. This is preparation for the combo spinlock that follows. Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
-
Guo Ren authored
arch_spin_value_unlocked() would cause an unnecessary memory access to the contended value. Although this won't cause a significant performance gap on most architectures, the arch_spin_value_unlocked() argument already contains enough information. Thus, remove the unnecessary atomic_read() in arch_spin_value_unlocked(). Callers of arch_spin_value_unlocked() benefit from this change; currently, the only caller is lockref. Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Peter Zijlstra <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
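The resulting ticket-lock variant boils down to a halfword compare on the value already held in a register, sketched here under the assumption that the generic arch_spinlock_t is an atomic_t:

  static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
  {
          u32 val = lock.counter;

          /* unlocked iff the ticket "next" half equals the "owner" half */
          return (val >> 16) == (val & 0xffff);
  }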
-
Guo Ren authored
The pvqspinlock needs additional sub-word atomic operations. Here is the list:

 - xchg8 (RCsc)
 - xchg16 (relaxed)
 - cmpxchg8/16_relaxed
 - cmpxchg8/16_release (RCpc)
 - cmpxchg8_acquire (RCpc)
 - cmpxchg8 (RCsc)

Although the paravirt qspinlock lacks native_qspinlock's fairness, giving these atomic operations a strong forward-progress guarantee prevents unnecessary retries, which would otherwise cause cache line bouncing. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
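One common way to build the 16-bit xchg from word-sized LR/SC is to operate on the aligned 32-bit word and mask in the halfword (little-endian assumed). A hedged sketch of the relaxed form only, not necessarily the patch's exact code; the acquire/release/RCsc variants add fences around the loop:

  static inline u16 __xchg16_relaxed(volatile u16 *ptr, u16 newval)
  {
          volatile u32 *p = (volatile u32 *)((unsigned long)ptr & ~0x3UL);
          unsigned int shift = ((unsigned long)ptr & 0x2) * 8;
          u32 mask = 0xffffU << shift;
          u32 insert = (u32)newval << shift;
          u32 old, tmp;

          asm volatile(
          "0:     lr.w    %0, %2\n"
          "       and     %1, %0, %3\n"   /* clear the target halfword */
          "       or      %1, %1, %4\n"   /* insert the new halfword   */
          "       sc.w    %1, %1, %2\n"
          "       bnez    %1, 0b\n"
          : "=&r" (old), "=&r" (tmp), "+A" (*p)
          : "r" (~mask), "r" (insert)
          : "memory");

          return (u16)((old & mask) >> shift);
  }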
-
Guo Ren authored
From a binary point of view, the custom xchg/cmpxchg_release macro definitions are no different from the common code. The xchg32/64 macro definitions have been abandoned in Linux. Thus, remove all of them. This is preparation for the cmpxchg_small & xchg8 patches that follow. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
-
- Aug 02, 2023
-
-
Andrew Jones authored
Now that we can support steal-time accounting, add the kconfig knobs allowing it to be enabled. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
-
Andrew Jones authored
When the SBI STA extension exists we can use it to implement paravirt steal-time support. Fill in the empty pv-time functions with an SBI STA implementation. Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
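The core of the steal-clock read is a sequence-count loop over the shared-memory record (layout sketched under the STA definitions commit below). This closely follows the shape of the series, assuming a per-cpu steal_time area registered via the SET_SHMEM call:

  static DEFINE_PER_CPU(struct sbi_sta_struct, steal_time) __aligned(64);

  static u64 pv_time_steal_clock(int cpu)
  {
          struct sbi_sta_struct *st = per_cpu_ptr(&steal_time, cpu);
          __le32 sequence;
          __le64 steal;

          /* An odd sequence means the SBI is mid-update: retry. */
          do {
                  sequence = READ_ONCE(st->sequence);
                  virt_rmb();
                  steal = READ_ONCE(st->steal);
                  virt_rmb();
          } while ((le32_to_cpu(sequence) & 1) ||
                   sequence != READ_ONCE(st->sequence));

          return le64_to_cpu(steal);
  }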
-
Andrew Jones authored
The SBI STA extension enables steal-time accounting. Add the definitions it specifies. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
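A sketch of those definitions as the draft STA extension lays them out ("STA" in ASCII gives the extension ID); the field layout follows the spec revision this series targets:

  #define SBI_EXT_STA                     0x535441

  enum sbi_ext_sta_fid {
          SBI_EXT_STA_STEAL_TIME_SET_SHMEM = 0,
  };

  struct sbi_sta_struct {
          __le32  sequence;       /* odd while the SBI updates the record */
          __le32  flags;
          __le64  steal;          /* stolen time, in nanoseconds */
          u8      preempted;
          u8      pad[47];        /* pad the record to 64 bytes */
  } __packed;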
-
Andrew Jones authored
Add the files and functions needed to support paravirt time on RISC-V. Also include the common code needed for the first application of pv-time, which is steal-time. In the next patches we'll complete the functions to fully enable steal-time support. Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
-
Alex Kogan authored
This performance optimization chooses probabilistically to avoid moving threads from the main queue into the secondary one when the secondary queue is empty. It is helpful when the lock is only lightly contended. In particular, it makes CNA less eager to create a secondary queue, but does not introduce any extra delays for threads waiting in that queue once it is created. Signed-off-by: Alex Kogan <alex.kogan@oracle.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Waiman Long <longman@redhat.com>
-
Alex Kogan authored
Prohibit moving certain threads (e.g., in irq and nmi contexts) to the secondary queue. Those prioritized threads will always stay in the primary queue, and so will have a shorter wait time for the lock. Signed-off-by: Alex Kogan <alex.kogan@oracle.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Waiman Long <longman@redhat.com>
-
Alex Kogan authored
Keep track of the time the thread at the head of the secondary queue has been waiting, and force inter-node handoff once this time passes a preset threshold. The default value for the threshold (1ms) can be overridden with the new kernel boot command-line option "qspinlock.numa_spinlock_threshold_ns". Signed-off-by: Alex Kogan <alex.kogan@oracle.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Waiman Long <longman@redhat.com>
-
Alex Kogan authored
In CNA, spinning threads are organized in two queues: a primary queue for threads running on the same node as the current lock holder, and a secondary queue for threads running on other nodes. After acquiring the MCS lock and before acquiring the spinlock, the MCS lock holder checks whether the next waiter in the primary queue (if one exists) is running on the same NUMA node. If it is not, that waiter is detached from the main queue and moved to the tail of the secondary queue. This way, we gradually filter the primary queue, leaving only waiters running on the same preferred NUMA node. For more details, see https://arxiv.org/abs/1810.05600 . Note that this variant of CNA may introduce starvation by continuously passing the lock between waiters in the main queue. This issue is addressed later in the series. Enabling CNA is controlled via a new configuration option (NUMA_AWARE_SPINLOCKS). By default, the CNA variant is patched in at boot time only if we run on a multi-node machine in a native environment and the new config is enabled. (For the time being, the patching requires CONFIG_PARAVIRT_SPINLOCKS to be enabled as well. However, this should be resolved once static_call() is available.) This default behavior can be overridden with the new kernel boot command-line option "numa_spinlock=on/off" (default is "auto"). Signed-off-by: Alex Kogan <alex.kogan@oracle.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Waiman Long <longman@redhat.com>
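A much-simplified sketch of the filtering step described above; struct and helper names are illustrative (assuming cna_node embeds mcs_spinlock as its first member), and the real code scans rather than checking only the immediate successor:

  static void cna_order_queue(struct mcs_spinlock *node)
  {
          struct cna_node *cn = (struct cna_node *)node;
          struct cna_node *next =
                  (struct cna_node *)READ_ONCE(node->next);

          if (!next)
                  return;

          if (next->numa_node != cn->numa_node) {
                  /* detach 'next' from the primary queue ... */
                  WRITE_ONCE(node->next, READ_ONCE(next->mcs.next));
                  /* ... and append it to the secondary queue's tail */
                  cna_splice_tail(cn, next);      /* assumed helper */
          }
  }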
-
Alex Kogan authored
Move some of the code manipulating the spin lock into separate functions. This would allow easier integration of alternative ways to manipulate that lock. Signed-off-by: Alex Kogan <alex.kogan@oracle.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Waiman Long <longman@redhat.com>
-
Alex Kogan authored
The mcs unlock macro (arch_mcs_lock_handoff) should accept the value to be stored into the lock argument as another argument. This allows using the same macro in cases where the value to be stored when passing the lock is different from 1. Signed-off-by: Alex Kogan <alex.kogan@oracle.com> Reviewed-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Waiman Long <longman@redhat.com>
-
- Jul 31, 2023
-
-
Guo Ren authored
machine_kexec() uses set_memory_x() to change the direct-mapping attributes from RW to RWX. The current implementation of set_memory_x() does not split hugepages in the linear mapping, so when a PGD mapping is used, the whole PGD is marked executable; changing the permissions at the PGD level must be propagated to all the page tables. When kexec jumps into the control_buffer, an instruction page fault happens, there is no minor_pagefault handler for it, and the kernel panics. The bug was found on an MMU_sv39 machine whose direct mapping used 1GB PUD/pgd entries. Here is the bug output:

 kexec_core: Starting new kernel
 Will call new kernel at 00300000 from hart id 0
 FDT image at 747c7000
 Bye...
 Unable to handle kernel paging request at virtual address ffffffda23b0d000
 Oops [#1]
 Modules linked in:
 CPU: 0 PID: 53 Comm: uinit Not tainted 6.4.0-rc6 #15
 Hardware name: Sophgo Mango (DT)
 epc : 0xffffffda23b0d000
  ra : machine_kexec+0xa6/0xb0
 epc : ffffffda23b0d000 ra : ffffffff80008272 sp : ffffffc80c173d10
  gp : ffffffff8150e1e0 tp : ffffffd9073d2c40 t0 : 0000000000000000
  t1 : 0000000000000042 t2 : 6567616d69205444 s0 : ffffffc80c173d50
  s1 : ffffffd9076c4800 a0 : ffffffd9076c4800 a1 : 0000000000300000
  a2 : 00000000747c7000 a3 : 0000000000000000 a4 : ffffffd800000000
  a5 : 0000000000000000 a6 : ffffffd903619c40 a7 : ffffffffffffffff
  s2 : ffffffda23b0d000 s3 : 0000000000300000 s4 : 00000000747c7000
  s5 : 0000000000000000 s6 : 0000000000000000 s7 : 0000000000000000
  s8 : 0000000000000000 s9 : 0000000000000000 s10: 0000000000000000
  s11: 0000003f940001a0 t3 : ffffffff815351af t4 : ffffffff815351af
  t5 : ffffffff815351b0 t6 : ffffffc80c173b50
 status: 0000000200000100 badaddr: ffffffda23b0d000 cause: 000000000000000c

Given the current flaw in the set_memory_x() implementation, the simplest solution is to fix machine_kexec() to remap the control code page outside the linear mapping. Because the control code buffer moves from the direct-mapping area to a vmalloc location, an additional va_va_offset is needed to fix up va_pa_offset. Fixes: 3335068f ("riscv: Use PUD/P4D/PGD pages for the linear mapping") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reported-by: Xing XiaoGuang <xingxg2008@163.com> Signed-off-by: Guo Ren <guoren@kernel.org> Tested-by: Xing Xiaoguang <xingxg2008@163.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
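The direction of the fix, sketched: give the control code page its own executable alias outside the linear map instead of flipping direct-map permissions (the helper name is illustrative):

  #include <linux/vmalloc.h>

  static void *kexec_remap_control_page(struct page *page)
  {
          /* fresh VA with PAGE_KERNEL_EXEC; the linear mapping stays RW */
          return vmap(&page, 1, VM_MAP, PAGE_KERNEL_EXEC);
  }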
-
- Jul 27, 2023
-
- Jul 21, 2023
-
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Fix a qspinlock issue that loops calling cpu_relax and never exits. The call trace is: queued_spin_lock_slowpath -> arch_mcs_spin_lock_contended -> smp_cond_load_acquire. RISC-V has not defined smp_cond_load_acquire, so it uses the generic function defined in include/asm-generic/barrier.h. The generic smp_cond_load_acquire calls smp_cond_load_relaxed, which loops calling READ_ONCE and cpu_relax. On this platform, READ_ONCE needs a barrier after it to observe the new value. Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
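For reference, the generic loop in question, lightly abridged from include/asm-generic/barrier.h; per the commit message, nothing between iterations forces this core to observe a fresh value, hence the hang:

  #define smp_cond_load_relaxed(ptr, cond_expr) ({        \
          typeof(ptr) __PTR = (ptr);                      \
          __unqual_scalar_typeof(*ptr) VAL;               \
          for (;;) {                                      \
                  VAL = READ_ONCE(*__PTR);                \
                  if (cond_expr)                          \
                          break;                          \
                  cpu_relax();                            \
          }                                               \
          (typeof(*ptr))VAL;                              \
  })

The generic smp_cond_load_acquire() only adds an acquire barrier after this loop completes, so it does not help a load stuck inside the loop.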
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Add FORCE_MAX_ZONEORDER to support custom max-order requirements. The default of 13 allows 2^(13-1) pages x 4KiB = 16MB allocations, for requesting large (16MB) contiguous memory. Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Avoid a page fault when machine_kexec() calls the kexec method in the control_code_buffer. When PUD_SIZE is used as the mapping size, __set_memory() only updates init_mm, so other tasks take page faults when they use a PUD entry that was modified only in init_mm. Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-
Xiaoguang Xing authored
Signed-off-by: Xiaoguang Xing <xiaoguang.xing@sophgo.com>
-