Commit 7d9d077c authored by Linus Torvalds
Pull RCU updates from Paul McKenney:

 - Documentation updates

 - Miscellaneous fixes

 - Callback-offload updates, perhaps most notably a new
   RCU_NOCB_CPU_DEFAULT_ALL Kconfig option that causes all CPUs to be
   offloaded at boot time, regardless of kernel boot parameters.

   This is useful for battery-powered systems such as ChromeOS and
   Android. In addition, a new RCU_NOCB_CPU_CB_BOOST kernel boot
   parameter prevents offloaded callbacks from interfering with
   real-time workloads and with energy-efficiency mechanisms.
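
   The all-CPUs offload default described above can be sketched as a
   kernel config fragment (a minimal sketch; the option names come from
   this merge, but exact dependencies vary by tree):

       # Enable callback offloading and, by default, offload every CPU at boot
       CONFIG_RCU_NOCB_CPU=y
       CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y

   Per the kernel-parameters changes in this merge, an explicit
   rcu_nocbs= on the boot command line still takes precedence over this
   option.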

 - Polled grace-period updates, perhaps most notably making these APIs
   account for both normal and expedited grace periods.

 - Tasks RCU updates, perhaps most notably reducing the CPU overhead of
   RCU tasks trace grace periods by more than a factor of two on a
   system with 15,000 tasks.

   The reduction is expected to increase with the number of tasks, so it
   seems reasonable to hypothesize that a system with 150,000 tasks
   might see a 20-fold reduction in CPU overhead.
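
   The 20-fold figure follows from assuming the reduction factor scales
   roughly linearly with task count (a hypothesis in this message, not a
   measurement). A two-line sketch of that extrapolation:

```python
# Hypothesized reduction factor, assuming linear scaling with task count
# from the observed ~2x at 15,000 tasks (an assumption, not data).
def hypothesized_reduction(tasks: int) -> float:
    return 2.0 * tasks / 15_000

print(hypothesized_reduction(150_000))  # 20.0 under the linear assumption
```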

 - Torture-test updates

 - Updates that merge RCU's dyntick-idle tracking into context tracking,
   thus reducing the overhead of transitioning to kernel mode from
   either idle or nohz_full userspace execution for kernels that track
   context independently of RCU.

   This is expected to be helpful primarily for kernels built with
   CONFIG_NO_HZ_FULL=y

* tag 'rcu.2022.07.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (98 commits)
  rcu: Add irqs-disabled indicator to expedited RCU CPU stall warnings
  rcu: Diagnose extended sync_rcu_do_polled_gp() loops
  rcu: Put panic_on_rcu_stall() after expedited RCU CPU stall warnings
  rcutorture: Test polled expedited grace-period primitives
  rcu: Add polled expedited grace-period primitives
  rcutorture: Verify that polled GP API sees synchronous grace periods
  rcu: Make Tiny RCU grace periods visible to polled APIs
  rcu: Make polled grace-period API account for expedited grace periods
  rcu: Switch polled grace-period APIs to ->gp_seq_polled
  rcu/nocb: Avoid polling when my_rdp->nocb_head_rdp list is empty
  rcu/nocb: Add option to opt rcuo kthreads out of RT priority
  rcu: Add nocb_cb_kthread check to rcu_is_callbacks_kthread()
  rcu/nocb: Add an option to offload all CPUs on boot
  rcu/nocb: Fix NOCB kthreads spawn failure with rcu_nocb_rdp_deoffload() direct call
  rcu/nocb: Invert rcu_state.barrier_mutex VS hotplug lock locking order
  rcu/nocb: Add/del rdp to iterate from rcuog itself
  rcu/tree: Add comment to describe GP-done condition in fqs loop
  rcu: Initialize first_gp_fqs at declaration in rcu_gp_fqs()
  rcu/kvfree: Remove useless monitor_todo flag
  rcu: Cleanup RCU urgency state for offline CPU
  ...
parents c2a24a7a 34bc7b45
+5 −5
@@ -1844,10 +1844,10 @@ that meets this requirement.

Furthermore, NMI handlers can be interrupted by what appear to RCU to be
normal interrupts. One way that this can happen is for code that
-directly invokes rcu_irq_enter() and rcu_irq_exit() to be called
+directly invokes ct_irq_enter() and ct_irq_exit() to be called
from an NMI handler. This astonishing fact of life prompted the current
-code structure, which has rcu_irq_enter() invoking
-rcu_nmi_enter() and rcu_irq_exit() invoking rcu_nmi_exit().
+code structure, which has ct_irq_enter() invoking
+ct_nmi_enter() and ct_irq_exit() invoking ct_nmi_exit().
And yes, I also learned of this requirement the hard way.

Loadable Modules
@@ -2195,7 +2195,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
   sections, and RCU believes this CPU to be idle, no problem. This
   sort of thing is used by some architectures for light-weight
   exception handlers, which can then avoid the overhead of
-   rcu_irq_enter() and rcu_irq_exit() at exception entry and
+   ct_irq_enter() and ct_irq_exit() at exception entry and
   exit, respectively. Some go further and avoid the entireties of
   irq_enter() and irq_exit().
   Just make very sure you are running some of your tests with
@@ -2226,7 +2226,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
-| One approach is to do ``rcu_irq_exit();rcu_irq_enter();`` every so    |
+| One approach is to do ``ct_irq_exit();ct_irq_enter();`` every so      |
| often. But given that long-running interrupt handlers can cause other |
| problems, not least for response time, shouldn't you work to keep     |
| your interrupt handler's runtime within reasonable bounds?            |
+3 −3
@@ -97,12 +97,12 @@ warnings:
	which will include additional debugging information.

-	A low-level kernel issue that either fails to invoke one of the
-	variants of rcu_user_enter(), rcu_user_exit(), rcu_idle_enter(),
-	rcu_idle_exit(), rcu_irq_enter(), or rcu_irq_exit() on the one
+	variants of rcu_eqs_enter(true), rcu_eqs_exit(true), ct_idle_enter(),
+	ct_idle_exit(), ct_irq_enter(), or ct_irq_exit() on the one
	hand, or that invokes one of them too many times on the other.
	Historically, the most frequent issue has been an omission
	of either irq_enter() or irq_exit(), which in turn invoke
-	rcu_irq_enter() or rcu_irq_exit(), respectively.  Building your
+	ct_irq_enter() or ct_irq_exit(), respectively.  Building your
	kernel with CONFIG_RCU_EQS_DEBUG=y can help track down these types
	of issues, which sometimes arise in architecture-specific code.

+34 −0
@@ -3667,6 +3667,9 @@
			just as if they had also been called out in the
			rcu_nocbs= boot parameter.

			Note that this argument takes precedence over
			the CONFIG_RCU_NOCB_CPU_DEFAULT_ALL option.

	noiotrap	[SH] Disables trapped I/O port accesses.

	noirqdebug	[X86-32] Disables the code which attempts to detect and
@@ -4560,6 +4563,9 @@
			no-callback mode from boot but the mode may be
			toggled at runtime via cpusets.

			Note that this argument takes precedence over
			the CONFIG_RCU_NOCB_CPU_DEFAULT_ALL option.

	rcu_nocb_poll	[KNL]
			Rather than requiring that offloaded CPUs
			(specified by rcu_nocbs= above) explicitly
@@ -4669,6 +4675,34 @@
			When RCU_NOCB_CPU is set, also adjust the
			priority of NOCB callback kthreads.

	rcutree.rcu_divisor= [KNL]
			Set the shift-right count to use to compute
			the callback-invocation batch limit bl from
			the number of callbacks queued on this CPU.
			The result will be bounded below by the value of
			the rcutree.blimit kernel parameter.  Every bl
			callbacks, the softirq handler will exit in
			order to allow the CPU to do other work.

			Please note that this callback-invocation batch
			limit applies only to non-offloaded callback
			invocation.  Offloaded callbacks are instead
			invoked in the context of an rcuoc kthread, which
			the scheduler will preempt as it does any other task.

	rcutree.nocb_nobypass_lim_per_jiffy= [KNL]
			On callback-offloaded (rcu_nocbs) CPUs,
			RCU reduces the lock contention that would
			otherwise be caused by callback floods through
			use of the ->nocb_bypass list.	However, in the
			common non-flooded case, RCU queues directly to
			the main ->cblist in order to avoid the extra
			overhead of the ->nocb_bypass list and its lock.
			But if there are too many callbacks queued during
			a single jiffy, RCU pre-queues the callbacks into
			the ->nocb_bypass queue.  The definition of "too
			many" is supplied by this kernel boot parameter.

	rcutree.rcu_nocb_gp_stride= [KNL]
			Set the number of NOCB callback kthreads in
			each group, which defaults to the square root
+3 −3
#
-# Feature name:          context-tracking
-#         Kconfig:       HAVE_CONTEXT_TRACKING
-#         description:   arch supports context tracking for NO_HZ_FULL
+# Feature name:          user-context-tracking
+#         Kconfig:       HAVE_CONTEXT_TRACKING_USER
+#         description:   arch supports user context tracking for NO_HZ_FULL
#
    -----------------------
    |         arch |status|
+1 −0
@@ -5165,6 +5165,7 @@ F: include/linux/console*
CONTEXT TRACKING
M:	Frederic Weisbecker <frederic@kernel.org>
M:	"Paul E. McKenney" <paulmck@kernel.org>
S:	Maintained
F:	kernel/context_tracking.c
F:	include/linux/context_tracking*