Commit 10a3efd0 authored by Linus Torvalds

Merge tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
 "perf stat:

   - Add support for hybrid PMUs to support systems such as Intel
     Alderlake and its BIG/little core/atom cpus.

   - Introduce 'bperf' to share hardware PMCs with BPF.

   - New --iostat option to collect and present IO stats on Intel
     hardware.

     This functionality is based on recently introduced sysfs attributes
     for Intel® Xeon® Scalable processor family (code name Skylake-SP)
     in commit bb42b3d3 ("perf/x86/intel/uncore: Expose an Uncore
     unit to IIO PMON mapping")

     It is intended to provide four I/O performance metrics, in MB, for
     each PCIe root port:

       - Inbound Read: I/O devices below root port read from the host memory
       - Inbound Write: I/O devices below root port write to the host memory
       - Outbound Read: CPU reads from I/O devices below root port
       - Outbound Write: CPU writes to I/O devices below root port

   - Align CSV output for summary.

   - Clarify --null use cases: Assess raw overhead of 'perf stat' or
     measure just wall clock time.

   - Improve readability of shadow stats.

  perf record:

   - Change the COMM when starting the workload so that --exclude-perf
     no longer appears to be ignored.

   - Improve 'Workload failed' message printing events + what was
     exec'ed.

   - Fix cross-arch support for TIME_CONV.

  perf report:

   - Add option to disable raw event ordering.

   - Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.

   - Improve the --stat output, which shows information about
     PERF_RECORD_ events.

   - Preserve identifier id in OCaml demangler.

  perf annotate:

   - Show full source location with 'l' hotkey in the 'perf annotate'
     TUI.

   - Add line numbers, as in the TUI, and the source location at EOL to
     the 'perf annotate' --stdio mode.

   - Add --demangle and --demangle-kernel to 'perf annotate'.

   - Allow configuring annotate.demangle{,_kernel} in 'perf config'.

   - Fix sample events lost in stdio mode.

  perf data:

   - Allow converting a perf.data file to JSON.

  libperf:

   - Add support for user space counter access.

   - Update topdown documentation to permit rdpmc calls.

  perf test:

   - Add 'perf test' for 'perf stat' CSV output.

   - Add 'perf test' entries to test the hybrid PMU support.

   - Cleanup 'perf test daemon' if its 'perf test' is interrupted.

   - Handle metric reuse in pmu-events parsing 'perf test' entry.

   - Add test for PE executable support.

   - Add timeout for wait for daemon start in its 'perf test' entries.

  Build:

   - Enable libtraceevent dynamic linking.

   - Improve feature detection output.

   - Fix caching of feature checks.

   - First round of updates for tools copies of kernel headers.

   - Enable warnings when compiling BPF programs.

  Vendor specific events:

   - Intel:
      - Add missing skylake & icelake model numbers.

   - arm64:
      - Add Hisi hip08 L1, L2 and L3 metrics.
      - Add Fujitsu A64FX PMU events.

   - PowerPC:
      - Initial JSON/events list for power10 platform.
      - Remove unsupported power9 metrics.

   - AMD:
      - Add Zen3 events.
      - Fix broken L2 Cache Hits from L2 HWPF metric.
      - Use lowercases for all the eventcodes and umasks.

  Hardware tracing:

   - arm64:
      - Update CoreSight ETM metadata format.
      - Fix bitmap for CS-ETM option.
      - Support PID tracing in config.
      - Detect pid in VMID for kernel running at EL2.

  Arch specific updates:

   - MIPS:
      - Support MIPS unwinding and dwarf-regs.
      - Generate mips syscalls_n64.c syscall table.

   - PowerPC:
      - Add support for PERF_SAMPLE_WEIGHT_STRUCT on PowerPC.
      - Support pipeline stage cycles for powerpc.

  libbeauty:

   - Fix fsconfig generator"

* tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (132 commits)
  perf build: Defer printing detected features to the end of all feature checks
  tools build: Allow deferring printing the results of feature detection
  perf build: Regenerate the FEATURE_DUMP file after extra feature checks
  perf session: Dump PERF_RECORD_TIME_CONV event
  perf session: Add swap operation for event TIME_CONV
  perf jit: Let convert_timestamp() to be backwards-compatible
  perf tools: Change fields type in perf_record_time_conv
  perf tools: Enable libtraceevent dynamic linking
  perf Documentation: Document intel-hybrid support
  perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
  perf tests: Support 'Convert perf time to TSC' test for hybrid
  perf tests: Support 'Session topology' test for hybrid
  perf tests: Support 'Parse and process metrics' test for hybrid
  perf tests: Support 'Track with sched_switch' test for hybrid
  perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
  perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
  perf tests: Add hybrid cases for 'Parse event definition strings' test
  perf record: Uniquify hybrid event name
  perf stat: Warn group events from different hybrid PMU
  perf stat: Filter out unmatched aggregation for hybrid event
  ...
parents 22650f14 c6e3bf43
+2 −0
@@ -14290,8 +14290,10 @@ R: Mark Rutland <mark.rutland@arm.com>
R:	Alexander Shishkin <alexander.shishkin@linux.intel.com>
R:	Jiri Olsa <jolsa@redhat.com>
R:	Namhyung Kim <namhyung@kernel.org>
L:	linux-perf-users@vger.kernel.org
L:	linux-kernel@vger.kernel.org
S:	Supported
W:	https://perf.wiki.kernel.org/
T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core
F:	arch/*/events/*
F:	arch/*/events/*/*
+18 −10
@@ -52,6 +52,7 @@ FEATURE_TESTS_BASIC := \
        libpython-version               \
        libslang                        \
        libslang-include-subdir         \
        libtraceevent                   \
        libcrypto                       \
        libunwind                       \
        pthread-attr-setaffinity-np     \
@@ -239,6 +240,8 @@ ifeq ($(VF),1)
  feature_verbose := 1
endif

feature_display_entries = $(eval $(feature_display_entries_code))
define feature_display_entries_code
  ifeq ($(feature_display),1)
    $(info )
    $(info Auto-detecting system features:)
@@ -253,3 +256,8 @@ ifeq ($(feature_verbose),1)
    $(foreach feat,$(TMP),$(call feature_print_status,$(feat),))
    $(info )
  endif
endef

ifeq ($(FEATURE_DISPLAY_DEFERRED),)
  $(call feature_display_entries)
endif
+4 −0
@@ -36,6 +36,7 @@ FILES= \
         test-libpython-version.bin             \
         test-libslang.bin                      \
         test-libslang-include-subdir.bin       \
         test-libtraceevent.bin                 \
         test-libcrypto.bin                     \
         test-libunwind.bin                     \
         test-libunwind-debug-frame.bin         \
@@ -196,6 +197,9 @@ $(OUTPUT)test-libslang.bin:
$(OUTPUT)test-libslang-include-subdir.bin:
	$(BUILD) -lslang

$(OUTPUT)test-libtraceevent.bin:
	$(BUILD) -ltraceevent

$(OUTPUT)test-libcrypto.bin:
	$(BUILD) -lcrypto

+12 −0
// SPDX-License-Identifier: GPL-2.0
#include <traceevent/trace-seq.h>

int main(void)
{
	int rv = 0;
	struct trace_seq s;
	trace_seq_init(&s);
	rv += !(s.state == TRACE_SEQ__GOOD);
	trace_seq_destroy(&s);
	return rv;
}
+75 −0
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_MATH64_H
#define _LINUX_MATH64_H

#include <linux/types.h>

#ifdef __x86_64__
static inline u64 mul_u64_u64_div64(u64 a, u64 b, u64 c)
{
	u64 q;

	asm ("mulq %2; divq %3" : "=a" (q)
				: "a" (a), "rm" (b), "rm" (c)
				: "rdx");

	return q;
}
#define mul_u64_u64_div64 mul_u64_u64_div64
#endif

#ifdef __SIZEOF_INT128__
static inline u64 mul_u64_u32_shr(u64 a, u32 b, unsigned int shift)
{
	return (u64)(((unsigned __int128)a * b) >> shift);
}

#else

#ifdef __i386__
static inline u64 mul_u32_u32(u32 a, u32 b)
{
	u32 high, low;

	asm ("mull %[b]" : "=a" (low), "=d" (high)
			 : [a] "a" (a), [b] "rm" (b) );

	return low | ((u64)high) << 32;
}
#else
static inline u64 mul_u32_u32(u32 a, u32 b)
{
	return (u64)a * b;
}
#endif

static inline u64 mul_u64_u32_shr(u64 a, u32 b, unsigned int shift)
{
	u32 ah, al;
	u64 ret;

	al = a;
	ah = a >> 32;

	ret = mul_u32_u32(al, b) >> shift;
	if (ah)
		ret += mul_u32_u32(ah, b) << (32 - shift);

	return ret;
}

#endif	/* __SIZEOF_INT128__ */

#ifndef mul_u64_u64_div64
static inline u64 mul_u64_u64_div64(u64 a, u64 b, u64 c)
{
	u64 quot, rem;

	quot = a / c;
	rem = a % c;

	return quot * b + (rem * b) / c;
}
#endif

#endif /* _LINUX_MATH64_H */
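
The two generic fallbacks in the header above can be exercised outside the kernel tree. What follows is a minimal sketch, not the header itself: the names mul_shr_split(), mul_div_split() and math64_self_check() are hypothetical, and <stdint.h> types stand in for the u32/u64 types from <linux/types.h>. It assumes shift <= 32 for the split multiply, as the mirrored code does, and it only checks inputs small enough that direct 64-bit arithmetic gives the reference answer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical name. Split multiply-and-shift, mirroring the
 * non-__int128 branch of mul_u64_u32_shr(). Assumes shift <= 32.
 */
static uint64_t mul_shr_split(uint64_t a, uint32_t b, unsigned int shift)
{
	uint32_t al = (uint32_t)a;         /* low 32 bits of a */
	uint32_t ah = (uint32_t)(a >> 32); /* high 32 bits of a */
	uint64_t ret = ((uint64_t)al * b) >> shift;

	if (ah)
		ret += ((uint64_t)ah * b) << (32 - shift);
	return ret;
}

/*
 * Hypothetical name. Generic a * b / c through quotient and remainder,
 * mirroring the fallback mul_u64_u64_div64(). The result is exact only
 * while (a % c) * b fits in 64 bits; c must be non-zero.
 */
static uint64_t mul_div_split(uint64_t a, uint64_t b, uint64_t c)
{
	uint64_t quot = a / c;
	uint64_t rem = a % c;

	return quot * b + (rem * b) / c;
}

/* Compare both helpers against direct arithmetic on small inputs. */
void math64_self_check(void)
{
	assert(mul_shr_split(1000, 3, 1) == (1000ULL * 3) >> 1);
	assert(mul_shr_split(1ULL << 32, 2, 4) == 1ULL << 29);
	assert(mul_div_split(10, 7, 3) == 70 / 3);
}
```

Calling math64_self_check() from a test program's main() runs the comparisons. The second assertion takes the high-word path (ah != 0), which is the case the split form exists to handle on targets without a 128-bit integer type.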