Commit 62cfb35f authored by Alex Kogan, committed by Wei Li

locking/qspinlock: Introduce CNA into the slow path of qspinlock

maillist inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8T8XV

Reference: https://lore.kernel.org/linux-arm-kernel/20210514200743.3026725-4-alex.kogan@oracle.com

--------------------------------

In CNA, spinning threads are organized in two queues: a primary queue for
threads running on the same node as the current lock holder, and a
secondary queue for threads running on other nodes. After acquiring the
MCS lock and before acquiring the spinlock, the MCS lock holder checks
whether the next waiter in the primary queue (if one exists) is running
on the same NUMA node. If it is not, that waiter is detached from the
primary queue and moved to the tail of the secondary queue. This way, we
gradually filter the primary queue, leaving only waiters running on the
same preferred NUMA node. For more details, see
https://arxiv.org/abs/1810.05600

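The queue-filtering step described above can be sketched as a small
userspace model (this is an illustration only, not the kernel
implementation; the struct and function names here are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a CNA waiter; not the kernel struct. */
struct cna_waiter {
	struct cna_waiter *next;
	int numa_node;
};

/*
 * Scan the primary queue starting at *head: detach waiters running on a
 * NUMA node other than my_node and append them to the secondary queue,
 * until a same-node waiter (or the end of the queue) is reached.
 * Returns the first same-node waiter, or NULL if none was found.
 */
static struct cna_waiter *
cna_scan_primary_queue(struct cna_waiter **head, int my_node,
		       struct cna_waiter **sec_head,
		       struct cna_waiter **sec_tail)
{
	struct cna_waiter *w = *head;

	while (w && w->numa_node != my_node) {
		struct cna_waiter *moved = w;

		w = w->next;
		moved->next = NULL;
		if (*sec_tail)
			(*sec_tail)->next = moved;	/* append to tail */
		else
			*sec_head = moved;		/* first entry */
		*sec_tail = moved;
	}
	*head = w;	/* primary queue now starts at a same-node waiter */
	return w;
}
```

The kernel code performs this scan under the MCS lock, so the sketch omits
all memory-ordering concerns that the real slow path must handle.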

Note that this variant of CNA may introduce starvation by continuously
passing the lock between waiters in the main queue. This issue will be
addressed later in the series.

Enabling CNA is controlled via a new configuration option
(NUMA_AWARE_SPINLOCKS). By default, the CNA variant is patched in at boot
time only if we run on a multi-node machine in a native environment and
the new config is enabled. (For the time being, the patching requires
CONFIG_PARAVIRT_SPINLOCKS to be enabled as well. However, this should be
resolved once static_call() is available.) This default behavior can be
overridden with the new kernel boot command-line option
"numa_spinlock=on/off" (default is "auto").
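A minimal sketch of how such a tri-state option might be parsed (the
function name and enum are assumptions for illustration, not the patch's
actual code, which would use the kernel's early_param machinery):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tri-state mirroring "numa_spinlock=auto/on/off". */
enum numa_spinlock_mode {
	NUMA_SL_AUTO,	/* enable only on multi-node native machines */
	NUMA_SL_ON,	/* unconditionally enable */
	NUMA_SL_OFF,	/* unconditionally disable */
};

/* An absent or unrecognized argument falls back to "auto". */
static enum numa_spinlock_mode parse_numa_spinlock(const char *arg)
{
	if (arg && strcmp(arg, "on") == 0)
		return NUMA_SL_ON;
	if (arg && strcmp(arg, "off") == 0)
		return NUMA_SL_OFF;
	return NUMA_SL_AUTO;
}
```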

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
parent 593d2759
@@ -4024,6 +4024,16 @@
 			NUMA balancing.
 			Allowed values are enable and disable
 
+	numa_spinlock=	[NUMA, PV_OPS] Select the NUMA-aware variant
+			of spinlock. The options are:
+			auto - Enable this variant if running on a multi-node
+			machine in native environment.
+			on  - Unconditionally enable this variant.
+			off - Unconditionally disable this variant.
+
+			Not specifying this option is equivalent to
+			numa_spinlock=auto.
+
 	numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
 			'node', 'default' can be specified
 			This can be set from sysctl after boot.
@@ -1549,6 +1549,26 @@ config NUMA

 	  Otherwise, you should say N.
 
+config NUMA_AWARE_SPINLOCKS
+	bool "Numa-aware spinlocks"
+	depends on NUMA
+	depends on QUEUED_SPINLOCKS
+	depends on 64BIT
+	# For now, we depend on PARAVIRT_SPINLOCKS to make the patching work.
+	# This is awkward, but hopefully would be resolved once static_call()
+	# is available.
+	depends on PARAVIRT_SPINLOCKS
+	default y
+	help
+	  Introduce NUMA (Non Uniform Memory Access) awareness into
+	  the slow path of spinlocks.
+
+	  In this variant of qspinlock, the kernel will try to keep the lock
+	  on the same node, thus reducing the number of remote cache misses,
+	  while trading some of the short term fairness for better performance.
+
+	  Say N if you want absolute first come first serve fairness.
+
 config AMD_NUMA
 	def_bool y
 	prompt "Old style AMD Opteron NUMA detection"
@@ -27,6 +27,10 @@ static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lo
 	return val;
 }
 
+#ifdef CONFIG_NUMA_AWARE_SPINLOCKS
+extern void cna_configure_spin_lock_slowpath(void);
+#endif
+
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
 extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
 extern void __pv_init_lock_hash(void);
@@ -1602,6 +1602,10 @@ void __init alternative_instructions(void)
 	 */
 	paravirt_set_cap();
 
+#if defined(CONFIG_NUMA_AWARE_SPINLOCKS)
+	cna_configure_spin_lock_slowpath();
+#endif
+
 	/*
 	 * First patch paravirt functions, such that we overwrite the indirect
 	 * call with the direct call.
@@ -17,7 +17,7 @@

 struct mcs_spinlock {
 	struct mcs_spinlock *next;
-	int locked; /* 1 if lock acquired */
+	unsigned int locked; /* 1 if lock acquired */
 	int count;  /* nesting count, see qspinlock.c */
 };
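The locked field is widened to unsigned int because later patches in the
series pass more than a boolean through it when handing over the lock.
One plausible encoding, shown here purely as an illustration (the exact
scheme used by the series may differ), reserves 0 for "not locked" and
shifts a NUMA node number into the remaining values:

```c
#include <assert.h>

/*
 * Hypothetical encoding of a NUMA node into the MCS 'locked' value:
 * 0 still means "lock not yet passed"; any nonzero value both grants
 * the lock and carries a node number. Not the series' actual scheme.
 */
static unsigned int encode_locked(unsigned int numa_node)
{
	return numa_node + 1;	/* nonzero => lock acquired */
}

static unsigned int decode_node(unsigned int locked)
{
	assert(locked != 0);	/* caller must hold the lock */
	return locked - 1;
}
```

An unsigned type keeps such arithmetic free of sign-extension surprises
when the value is compared or shifted.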
