Commit d392e49a authored by Linus Torvalds
Pull tracing tools updates from Steven Rostedt:

 - Use total duration to calculate average in rtla osnoise_hist

 - Use 2 digit precision for displaying average

 - Print an intuitive auto analysis of timerlat results

 - Add auto analysis to timerlat top

 - Add hwnoise, which is the same as osnoise but focuses on hardware

 - Small clean ups

* tag 'trace-tools-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation/rtla: Add hwnoise man page
  rtla: Add hwnoise tool
  Documentation/rtla: Add timerlat-top auto-analysis options
  rtla/timerlat: Add auto-analysis support to timerlat top
  rtla/timerlat: Add auto-analysis core
  tools/tracing/rtla: osnoise_hist: display average with two-digit precision
  tools/tracing/rtla: osnoise_hist: use total duration for average calculation
  tools/rv: Remove unneeded semicolon
parents 2562af68 5dc3750e
+7 −0
**--dump-tasks**

        prints the task running on all CPUs if stop conditions are met (depends on !--no-aa)

**--no-aa**

        disable auto-analysis, reducing rtla timerlat CPU usage
+1 −0
@@ -17,6 +17,7 @@ behavior on specific hardware.
   rtla-timerlat
   rtla-timerlat-hist
   rtla-timerlat-top
   rtla-hwnoise

.. only::  subproject and html

+107 −0
.. SPDX-License-Identifier: GPL-2.0

============
rtla-hwnoise
============
------------------------------------------
Detect and quantify hardware-related noise
------------------------------------------

:Manual section: 1

SYNOPSIS
========

**rtla hwnoise** [*OPTIONS*]

DESCRIPTION
===========

**rtla hwnoise** collects the periodic summary from the *osnoise* tracer
running with *interrupts disabled*. By disabling interrupts, and the scheduling
of threads as a consequence, only non-maskable interrupts and hardware-related
noise are allowed.

The tool also allows configuring the *osnoise* tracer and collecting
the tracer output.

OPTIONS
=======
.. include:: common_osnoise_options.rst

.. include:: common_top_options.rst

.. include:: common_options.rst

EXAMPLE
=======
In the example below, the **rtla hwnoise** tool is set to run on CPUs *1-7*
on a system with 8 cores/16 threads with hyper-threading enabled.

The tool is set to detect any noise higher than *one microsecond*,
to run for *ten minutes*, displaying a summary of the report at the
end of the session::

  # rtla hwnoise -c 1-7 -T 1 -d 10m -q
                                          Hardware-related Noise
  duration:   0 00:10:00 | time is in us
  CPU Period       Runtime        Noise  % CPU Aval   Max Noise   Max Single          HW          NMI
    1 #599       599000000          138    99.99997           3            3           4           74
    2 #599       599000000           85    99.99998           3            3           4           75
    3 #599       599000000           86    99.99998           4            3           6           75
    4 #599       599000000           81    99.99998           4            4           2           75
    5 #599       599000000           85    99.99998           2            2           2           75
    6 #599       599000000           76    99.99998           2            2           0           75
    7 #599       599000000           77    99.99998           3            3           0           75


The first column shows the *CPU*, and the second column shows how many
*Periods* the tool ran during the session. The *Runtime* is the time
the tool effectively runs on the CPU. The *Noise* column is the sum of
all noise that the tool observed, and the *% CPU Aval* is the percentage
of the *Runtime* that was free of *Noise*.
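
The relation between *Runtime*, *Noise* and *% CPU Aval* can be checked with
a few lines of Python. This is only a sketch, not the code used by **rtla**:
the function name ``cpu_available_pct`` is hypothetical, and truncating to five
decimal places (rather than rounding) is an assumption inferred from the sample
table above.

```python
def cpu_available_pct(runtime_us: int, noise_us: int, decimals: int = 5) -> str:
    """Return the noise-free share of the runtime as a percentage string.

    Integer arithmetic is used so the result is truncated, not rounded
    (assumption: this matches how the sample output was produced).
    """
    scale = 10 ** decimals
    # Percentage scaled by 10**decimals, floored by integer division.
    scaled = (runtime_us - noise_us) * 100 * scale // runtime_us
    return f"{scaled // scale}.{scaled % scale:0{decimals}d}"


# Values taken from the CPU 1 and CPU 3 rows of the example output.
cpu1 = cpu_available_pct(599_000_000, 138)
cpu3 = cpu_available_pct(599_000_000, 86)
print(cpu1, cpu3)
```

With the values from the table, the sketch yields the same *% CPU Aval* figures
as the rows for CPU 1 and CPU 3, and ``100.00000`` when the noise is zero.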

The *Max Noise* column is the maximum hardware noise the tool detected in a
single period, and the *Max Single* is the maximum single noise seen.

The *HW* and *NMI* columns show the total number of *hardware* and *NMI* noise
occurrences observed by the tool.

For example, *CPU 3* ran *599* periods of *1 second Runtime*. The CPU received
*86 us* of noise during the entire execution, leaving *99.99998 %* of CPU time
for the application. In the worst single period, the CPU caused *4 us* of
noise to the application, but it was certainly caused by more than one
noise occurrence, as the *Max Single* noise was *3 us*. The CPU has *HW noise*,
at a rate of *six occurrences* per *ten minutes*. The CPU also has *NMIs*, at a
higher frequency: around *seven per minute* (*75* in *ten minutes*).

The tool should report *0* hardware-related noise in the ideal situation.
For example, by disabling hyper-threading to remove the hardware noise,
and disabling the TSC watchdog to remove the NMI (it is possible to identify
this using tracing options of **rtla hwnoise**), it was possible to reach
the ideal situation on the same hardware::

  # rtla hwnoise -c 1-7 -T 1 -d 10m -q
                                          Hardware-related Noise
  duration:   0 00:10:00 | time is in us
  CPU Period       Runtime        Noise  % CPU Aval   Max Noise   Max Single          HW          NMI
    1 #599       599000000            0   100.00000           0            0           0            0
    2 #599       599000000            0   100.00000           0            0           0            0
    3 #599       599000000            0   100.00000           0            0           0            0
    4 #599       599000000            0   100.00000           0            0           0            0
    5 #599       599000000            0   100.00000           0            0           0            0
    6 #599       599000000            0   100.00000           0            0           0            0
    7 #599       599000000            0   100.00000           0            0           0            0

SEE ALSO
========

**rtla-osnoise**\(1)

Osnoise tracer documentation: <https://www.kernel.org/doc/html/latest/trace/osnoise-tracer.html>

AUTHOR
======
Written by Daniel Bristot de Oliveira <bristot@kernel.org>

.. include:: common_appendix.rst
+73 −91
@@ -30,102 +30,84 @@ OPTIONS

.. include:: common_options.rst

.. include:: common_timerlat_aa.rst

EXAMPLE
=======

In the example below, the *timerlat* tracer is set to capture the stack trace at
the IRQ handler, printing it to the buffer if the *Thread* timer latency is
higher than *30 us*. It is also set to stop the session if a *Thread* timer
latency higher than *30 us* is hit. Finally, it is set to save the trace
buffer if the stop condition is hit::
In the example below, the timerlat tracer is dispatched on CPUs *1-23* in the
automatic trace mode, instructing the tracer to stop if a latency of *40 us*
or higher is found::

  [root@alien ~]# rtla timerlat top -s 30 -T 30 -t
  # timerlat -a 40 -c 1-23 -q
                                     Timer Latency
    0 00:00:59   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
    0 00:00:12   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
  CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
    0 #58634     |        1         0         1        10 |       11         2        10        23
    1 #58634     |        1         0         1         9 |       12         2         9        23
    2 #58634     |        0         0         1        11 |       10         2         9        23
    3 #58634     |        1         0         1        11 |       11         2         9        24
    4 #58634     |        1         0         1        10 |       11         2         9        26
    5 #58634     |        1         0         1         8 |       10         2         9        25
    6 #58634     |       12         0         1        12 |       30         2        10        30 <--- CPU with spike
    7 #58634     |        1         0         1         9 |       11         2         9        23
    8 #58633     |        1         0         1         9 |       11         2         9        26
    9 #58633     |        1         0         1         9 |       10         2         9        26
   10 #58633     |        1         0         1        13 |       11         2         9        28
   11 #58633     |        1         0         1        13 |       12         2         9        24
   12 #58633     |        1         0         1         8 |       10         2         9        23
   13 #58633     |        1         0         1        10 |       10         2         9        22
   14 #58633     |        1         0         1        18 |       12         2         9        27
   15 #58633     |        1         0         1        10 |       11         2         9        28
   16 #58633     |        0         0         1        11 |        7         2         9        26
   17 #58633     |        1         0         1        13 |       10         2         9        24
   18 #58633     |        1         0         1         9 |       13         2         9        22
   19 #58633     |        1         0         1        10 |       11         2         9        23
   20 #58633     |        1         0         1        12 |       11         2         9        28
   21 #58633     |        1         0         1        14 |       11         2         9        24
   22 #58633     |        1         0         1         8 |       11         2         9        22
   23 #58633     |        1         0         1        10 |       11         2         9        27
  timerlat hit stop tracing
  saving trace to timerlat_trace.txt
  [root@alien bristot]# tail -60 timerlat_trace.txt
  [...]
      timerlat/5-79755   [005] .......   426.271226: #58634 context thread timer_latency     10823 ns
              sh-109404  [006] dnLh213   426.271247: #58634 context    irq timer_latency     12505 ns
              sh-109404  [006] dNLh313   426.271258: irq_noise: local_timer:236 start 426.271245463 duration 12553 ns
              sh-109404  [006] d...313   426.271263: thread_noise:       sh:109404 start 426.271245853 duration 4769 ns
      timerlat/6-79756   [006] .......   426.271264: #58634 context thread timer_latency     30328 ns
      timerlat/6-79756   [006] ....1..   426.271265: <stack trace>
  => timerlat_irq
  => __hrtimer_run_queues
  => hrtimer_interrupt
  => __sysvec_apic_timer_interrupt
  => sysvec_apic_timer_interrupt
  => asm_sysvec_apic_timer_interrupt
  => _raw_spin_unlock_irqrestore			<---- spinlock that disabled interrupt.
  => try_to_wake_up
  => autoremove_wake_function
  => __wake_up_common
  => __wake_up_common_lock
  => ep_poll_callback
  => __wake_up_common
  => __wake_up_common_lock
  => fsnotify_add_event
  => inotify_handle_inode_event
  => fsnotify
  => __fsnotify_parent
  => __fput
  => task_work_run
  => exit_to_user_mode_prepare
  => syscall_exit_to_user_mode
  => do_syscall_64
  => entry_SYSCALL_64_after_hwframe
  => 0x7265000001378c
  => 0x10000cea7
  => 0x25a00000204a
  => 0x12e302d00000000
  => 0x19b51010901b6
  => 0x283ce00726500
  => 0x61ea308872
  => 0x00000fe3
            bash-109109  [007] d..h...   426.271265: #58634 context    irq timer_latency      1211 ns
      timerlat/6-79756   [006] .......   426.271267: timerlat_main: stop tracing hit on cpu 6

In the trace, it is possible to notice that the *IRQ* timer latency was
already high, accounting for *12505 ns*. The IRQ delay was caused by the
*bash-109109* process that disabled IRQs in the wake-up path
(*try_to_wake_up()* function). The duration of the IRQ handler that woke
up the timerlat thread, informed with the **osnoise:irq_noise** event, was
also high and added another *12553 ns* to the Thread latency. Finally, the
**osnoise:thread_noise** added by the currently running thread (including
the scheduling overhead) added another *4769 ns*. Summing up these values,
the *Thread* timer latency accounted for *30328 ns*.

The primary reason for this high value is the wake-up path that was hit
twice during this case: when the *bash-109109* was waking up a thread
and then when the *timerlat* thread was awakened. This information can
then be used as the starting point of a more fine-grained analysis.
    1 #12322     |        0         0         1        15 |       10         3         9        31
    2 #12322     |        3         0         1        12 |       10         3         9        23
    3 #12322     |        1         0         1        21 |        8         2         8        34
    4 #12322     |        1         0         1        17 |       10         2        11        33
    5 #12322     |        0         0         1        12 |        8         3         8        25
    6 #12322     |        1         0         1        14 |       16         3        11        35
    7 #12322     |        0         0         1        14 |        9         2         8        29
    8 #12322     |        1         0         1        22 |        9         3         9        34
    9 #12322     |        0         0         1        14 |        8         2         8        24
   10 #12322     |        1         0         0        12 |        9         3         8        24
   11 #12322     |        0         0         0        15 |        6         2         7        29
   12 #12321     |        1         0         0        13 |        5         3         8        23
   13 #12319     |        0         0         1        14 |        9         3         9        26
   14 #12321     |        1         0         0        13 |        6         2         8        24
   15 #12321     |        1         0         1        15 |       12         3        11        27
   16 #12318     |        0         0         1        13 |        7         3        10        24
   17 #12319     |        0         0         1        13 |       11         3         9        25
   18 #12318     |        0         0         0        12 |        8         2         8        20
   19 #12319     |        0         0         1        18 |       10         2         9        28
   20 #12317     |        0         0         0        20 |        9         3         8        34
   21 #12318     |        0         0         0        13 |        8         3         8        28
   22 #12319     |        0         0         1        11 |        8         3        10        22
   23 #12320     |       28         0         1        28 |       41         3        11        41
  rtla timerlat hit stop tracing
  ## CPU 23 hit stop tracing, analyzing it ##
  IRQ handler delay:                                        27.49 us (65.52 %)
  IRQ latency:                                              28.13 us
  Timerlat IRQ duration:                                     9.59 us (22.85 %)
  Blocking thread:                                           3.79 us (9.03 %)
                         objtool:49256                       3.79 us
    Blocking thread stacktrace
                -> timerlat_irq
                -> __hrtimer_run_queues
                -> hrtimer_interrupt
                -> __sysvec_apic_timer_interrupt
                -> sysvec_apic_timer_interrupt
                -> asm_sysvec_apic_timer_interrupt
                -> _raw_spin_unlock_irqrestore
                -> cgroup_rstat_flush_locked
                -> cgroup_rstat_flush_irqsafe
                -> mem_cgroup_flush_stats
                -> mem_cgroup_wb_stats
                -> balance_dirty_pages
                -> balance_dirty_pages_ratelimited_flags
                -> btrfs_buffered_write
                -> btrfs_do_write_iter
                -> vfs_write
                -> __x64_sys_pwrite64
                -> do_syscall_64
                -> entry_SYSCALL_64_after_hwframe
  ------------------------------------------------------------------------
    Thread latency:                                          41.96 us (100%)

  The system has exit from idle latency!
    Max timerlat IRQ latency from idle: 17.48 us in cpu 4
  Saving trace to timerlat_trace.txt

In this case, the major factor was the delay suffered by the *IRQ handler*
that handles the **timerlat** wakeup: *65.52%*. This can be caused by the
current thread masking interrupts, which can be seen in the blocking
thread stacktrace: the current thread (*objtool:49256*) disabled interrupts
via *raw spin lock* operations inside the memory cgroup code, while doing
a write syscall on a btrfs file system.
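
The breakdown printed by the auto-analysis can be cross-checked by dividing
each component by the total *Thread latency*. The sketch below is illustrative
only: the variable and function names are hypothetical, and it uses the
rounded values shown in the report, so the last digit may differ slightly from
what **rtla** prints from its nanosecond-precision data.

```python
# Values (in microseconds) copied from the auto-analysis report above.
thread_latency_us = 41.96
components_us = {
    "IRQ handler delay": 27.49,
    "Timerlat IRQ duration": 9.59,
    "Blocking thread": 3.79,
}


def shares(components: dict, total: float) -> dict:
    """Return each component as a percentage of the total latency."""
    return {name: 100.0 * value / total for name, value in components.items()}


pct = shares(components_us, thread_latency_us)
# Latency not attributed to any of the listed components.
unaccounted_us = thread_latency_us - sum(components_us.values())

for name, value in pct.items():
    print(f"{name:<24}{value:6.2f} %")
print(f"{'Unaccounted':<24}{unaccounted_us:6.2f} us")
```

The three listed components do not add up to the full thread latency; the
small remainder is latency that the report does not attribute to a specific
source.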

The raw trace is saved in the **timerlat_trace.txt** file for further analysis.

Note that **rtla timerlat** was dispatched without changing *timerlat* tracer
threads' priority. That is generally not needed because these threads have
+2 −0
@@ -119,6 +119,8 @@ install: doc_install
	$(STRIP) $(DESTDIR)$(BINDIR)/rtla
	@test ! -f $(DESTDIR)$(BINDIR)/osnoise || rm $(DESTDIR)$(BINDIR)/osnoise
	ln -s rtla $(DESTDIR)$(BINDIR)/osnoise
	@test ! -f $(DESTDIR)$(BINDIR)/hwnoise || rm $(DESTDIR)$(BINDIR)/hwnoise
	ln -s rtla $(DESTDIR)$(BINDIR)/hwnoise
	@test ! -f $(DESTDIR)$(BINDIR)/timerlat || rm $(DESTDIR)$(BINDIR)/timerlat
	ln -s rtla $(DESTDIR)$(BINDIR)/timerlat
