  1. Apr 29, 2021
    • perf build: Defer printing detected features to the end of all feature checks · c6e3bf43
      Arnaldo Carvalho de Melo authored
      
      
      We were doing it in tools/build/Makefile.feature, after running the
      feature checks, but then in tools/perf/Makefile.config we can call more
      feature checks when we notice that some feature check failed, like when
      libbfd wasn't detected and we add libraries to the LDFLAGS of its
      feature check to try again, etc.
      
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools build: Allow deferring printing the results of feature detection · 19177bc3
      Arnaldo Carvalho de Melo authored
      
      
      By setting FEATURE_DISPLAY_DEFERRED=1 a tool may ask for the printout
      of the detected features in tools/build/Makefile.feature to be done
      later, after tool-specific extra feature checks are done.
      
      The perf tool will do it via its tools/perf/Makefile.config, as it
      performs such extra feature checks.
      
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Regenerate the FEATURE_DUMP file after extra feature checks · fbed59f8
      Jiri Olsa authored
      
      
      Feature detection is done in tools/build/Makefile.feature. We may exit
      there with some features not detected and then, in
      tools/perf/Makefile.config, try adding extra libraries to link and
      do extra feature checks to see if we now find the feature.
      
      This is the case with disassembler-four-args, which checks if the
      disassembler() function in libopcodes (binutils) has a signature with
      one or with four arguments, as this is not ABI and they changed it at
      some point.
      
      This is not a problem when doing normal builds, for instance:
      
        $ make -C tools/perf O=/tmp/build/perf
      
      That is because we don't use what is in FEATURE-DUMP at that point. It
      is a problem, though, if we pass FEATURE_DUMP=/previously-detected-features,
      as we do in 'make -C tools/perf build-test' to reuse the feature
      detection in the many build combinations we test there.
      
      When that is done feature-disassembler-four-args will be set to 0, but
      openSUSE 15.1 has the four-argument function signature in
      disassembler(). The build thus fails.
      
      Fix it by rewriting the FEATURE-DUMP file at the end of
      tools/perf/Makefile.config to register the features we retested in that
      makefile.
      
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf session: Dump PERF_RECORD_TIME_CONV event · 81e70d7e
      Leo Yan authored
      
      
      Currently the perf tool uses the common stub function
      process_event_op2_stub() for dumping the TIME_CONV event, so it doesn't
      output the clock parameters contained in the event.
      
      This patch adds a callback function for dumping the hardware clock
      parameters in the TIME_CONV event.
      
      Before:
      
        # perf report -D
      
        0x978 [0x38]: event: 79
        .
        . ... raw event: size 56 bytes
        .  0000:  4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00  O.....8.........
        .  0010:  00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff  ..@........<BF><DF><FF><FF><FF>
        .  0020:  d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00  <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>.
        .  0030:  01 01 00 00 00 00 00 00                          ........
      
        0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV
        : unhandled!
      
        [...]
      
      After:
      
        # perf report -D
      
        0x978 [0x38]: event: 79
        .
        . ... raw event: size 56 bytes
        .  0000:  4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00  O.....8.........
        .  0010:  00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff  ..@........<BF><DF><FF><FF><FF>
        .  0020:  d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00  <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>.
        .  0030:  01 01 00 00 00 00 00 00                          ........
      
        0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV
        ... Time Shift      21
        ... Time Muliplier  20971520
        ... Time Zero       18446743935180835206
        ... Time Cycles     13852918225
        ... Time Mask       0xffffffffffffff
        ... Cap Time Zero   1
        ... Cap Time Short  1
        : unhandled!
      
        [...]
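
As a cross-check of the dump above, the raw bytes can be decoded by hand. The following is a minimal sketch, not the perf callback itself: it assumes a little-endian recording and the field offsets implied by the printed values (8-byte header, five 64-bit fields, then two one-byte flags). The names raw_time_conv and read_le are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* The 56 raw bytes of the PERF_RECORD_TIME_CONV event shown in the dump. */
static const uint8_t raw_time_conv[56] = {
	0x4f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x38, 0x00, /* header: type, misc, size */
	0x15, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* time shift */
	0x00, 0x00, 0x40, 0x01, 0x00, 0x00, 0x00, 0x00, /* time multiplier */
	0x86, 0x89, 0x0b, 0xbf, 0xdf, 0xff, 0xff, 0xff, /* time zero */
	0xd1, 0xc1, 0xb2, 0x39, 0x03, 0x00, 0x00, 0x00, /* time cycles */
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, /* time mask */
	0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* cap flags + padding */
};

/* Read a little-endian unsigned value of 'len' bytes (len <= 8). */
static uint64_t read_le(const uint8_t *p, size_t len)
{
	uint64_t v = 0;

	for (size_t i = 0; i < len; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}
```

For example, read_le(raw_time_conv + 8, 8) yields the time shift of 21 and read_le(raw_time_conv + 16, 8) the multiplier 20971520, matching the "After" output.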
      
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
      Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Link: https://lore.kernel.org/r/20210428120915.7123-5-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf session: Add swap operation for event TIME_CONV · 050ffc44
      Leo Yan authored
      Since commit d110162c ("perf tsc: Support cap_user_time_short for
      event TIME_CONV"), the event PERF_RECORD_TIME_CONV has extended the data
      structure for clock parameters.
      
      To be backwards-compatible, this patch adds a dedicated swap operation
      for the event PERF_RECORD_TIME_CONV. By checking whether the event
      contains the field "time_cycles", it can support both the old and the
      new event formats.
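
The size-dependent swap described here could look roughly like the sketch below. It is an illustration under assumptions, not the patch: the struct and function names are simplified stand-ins, the header is assumed to have been byte-swapped already, and swap64 is a portable helper written for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the perf structures; names are illustrative. */
struct ev_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct time_conv_event {
	struct ev_header header;
	uint64_t time_shift;
	uint64_t time_mult;
	uint64_t time_zero;
	/* Fields below exist only in the extended format. */
	uint64_t time_cycles;
	uint64_t time_mask;
	uint8_t  cap_user_time_zero;   /* single bytes need no swapping */
	uint8_t  cap_user_time_short;
	uint8_t  reserved[6];
};

/* Reverse the byte order of a 64-bit value. */
static uint64_t swap64(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/* Assumes header.size is already in host byte order. */
static void time_conv_swap(struct time_conv_event *ev)
{
	ev->time_shift = swap64(ev->time_shift);
	ev->time_mult  = swap64(ev->time_mult);
	ev->time_zero  = swap64(ev->time_zero);

	/* Only touch the extended fields when the event is large enough. */
	if (ev->header.size > offsetof(struct time_conv_event, time_cycles)) {
		ev->time_cycles = swap64(ev->time_cycles);
		ev->time_mask   = swap64(ev->time_mask);
	}
}
```

With an old-format event (header.size of 32) the extended fields are left untouched; with a 56-byte event they are swapped as well.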
      
      Fixes: d110162c ("perf tsc: Support cap_user_time_short for event TIME_CONV")
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
      Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Link: https://lore.kernel.org/r/20210428120915.7123-4-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf jit: Let convert_timestamp() to be backwards-compatible · aa616f5a
      Leo Yan authored
      Commit d110162c ("perf tsc: Support cap_user_time_short for
      event TIME_CONV") supports the extended parameters for event TIME_CONV,
      but it broke backwards compatibility, so any perf data file with the old
      event format fails to convert timestamps.
      
      This patch introduces a helper event_contains() to check if an event
      contains a specific member or not.  For backwards compatibility, the
      extended parameters are copied only if the event size confirms that the
      event TIME_CONV contains them.
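
A size-based membership check of this kind can be sketched as follows. This is an assumption-laden illustration rather than the patch: EVENT_CONTAINS, struct tsc_conversion and fill_tsc_conversion are hypothetical names, and the macro takes the struct type implicitly instead of relying on compiler extensions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ev_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct time_conv_event {
	struct ev_header header;
	uint64_t time_shift;
	uint64_t time_mult;
	uint64_t time_zero;
	uint64_t time_cycles;          /* extended format only */
	uint64_t time_mask;            /* extended format only */
	uint8_t  cap_user_time_zero;   /* extended format only */
	uint8_t  cap_user_time_short;  /* extended format only */
	uint8_t  reserved[6];
};

/* Hypothetical destination structure for the conversion parameters. */
struct tsc_conversion {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
	uint64_t time_cycles;
	uint64_t time_mask;
	bool     cap_user_time_zero;
	bool     cap_user_time_short;
};

/* True when the recorded event is large enough to hold member 'mem'. */
#define EVENT_CONTAINS(ev, mem) \
	((ev)->header.size > offsetof(struct time_conv_event, mem))

static void fill_tsc_conversion(const struct time_conv_event *ev,
				struct tsc_conversion *tc)
{
	/* The compound literal zeroes every member not named here. */
	*tc = (struct tsc_conversion){
		.time_shift = (uint16_t)ev->time_shift,
		.time_mult  = (uint32_t)ev->time_mult,
		.time_zero  = ev->time_zero,
	};

	if (EVENT_CONTAINS(ev, time_cycles)) {
		tc->time_cycles = ev->time_cycles;
		tc->time_mask = ev->time_mask;
		tc->cap_user_time_zero = ev->cap_user_time_zero;
		tc->cap_user_time_short = ev->cap_user_time_short;
	}
}
```

An old 32-byte event therefore leaves the extended members at zero, while a 56-byte event has them copied.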
      
      Committer notes:
      
      To keep this building with older compilers, add this change:
      
        -       struct perf_tsc_conversion tc = { 0 };
        +       struct perf_tsc_conversion tc = { .time_shift = 0, };
      
      Fixes: d110162c ("perf tsc: Support cap_user_time_short for event TIME_CONV")
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
      Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Link: https://lore.kernel.org/r/20210428120915.7123-3-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Change fields type in perf_record_time_conv · e1d380ea
      Leo Yan authored
      The C standard says "An object declared as type _Bool is large enough to
      store the values 0 and 1", so the bool type can be 1 byte or larger than
      1 byte.  Thus the size of bool is uncertain across different
      compilers.
      
      This patch changes the bool type in structure perf_record_time_conv to
      __u8 type, and pads extra bytes for 8-byte alignment; this gives a
      reliable structure size.
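
The effect of fixed-width flags plus explicit padding can be checked at compile time. The sketch below is illustrative only: the struct is a simplified stand-in using uint8_t in place of __u8, the member names are assumptions, and the expected size of 56 bytes is taken from the 0x38 event size seen in the 'perf report -D' dump of the TIME_CONV event.

```c
#include <stddef.h>
#include <stdint.h>

struct ev_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Flags as one-byte integers, followed by explicit padding. */
struct time_conv_fixed {
	struct ev_header header;
	uint64_t time_shift;
	uint64_t time_mult;
	uint64_t time_zero;
	uint64_t time_cycles;
	uint64_t time_mask;
	uint8_t  cap_user_time_zero;
	uint8_t  cap_user_time_short;
	uint8_t  reserved[6];          /* pads the struct to an 8-byte multiple */
};

/* These hold regardless of how large the compiler makes bool. */
_Static_assert(sizeof(struct time_conv_fixed) == 56,
	       "event must be exactly 56 bytes");
_Static_assert(offsetof(struct time_conv_fixed, cap_user_time_zero) == 48,
	       "flags start right after the five 64-bit fields");
_Static_assert(sizeof(struct time_conv_fixed) % 8 == 0,
	       "event size must stay 8-byte aligned");
```

Because every member has a fixed width and the padding is spelled out, the layout no longer depends on the compiler's choice for sizeof(bool).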
      
      Fixes: d110162c ("perf tsc: Support cap_user_time_short for event TIME_CONV")
      Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
      Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Link: https://lore.kernel.org/r/20210428120915.7123-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Enable libtraceevent dynamic linking · 56d32d4c
      Michael Petlan authored
      
      
      Currently we support only static linking with the kernel's libtraceevent
      (tools/lib/traceevent). This patch adds libtraceevent package detection
      and support for linking perf with it dynamically.
      
      The libtraceevent package status is displayed with:
      
        $ make VF=1 LIBTRACEEVENT_DYNAMIC=1
        ...
        ...                 libtraceevent: [ on  ]
      
      Default behavior remains the same (static linking).
      
      Committer testing:
      
        $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent
        Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel.  Stop.
        $
      
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      LPU-Reference: 20210428092023.4009-1-mpetlan@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf Documentation: Document intel-hybrid support · 2750ce1d
      Jin Yao authored
      
      
      Add some words and examples to help understand
      Intel hybrid perf support.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-27-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid · a37f3b88
      Jin Yao authored
      
      
      Currently we don't support shadow stat for hybrid.
      
        root@ssp-pwrt-002:~# ./perf stat -e cycles,instructions -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
            12,883,109,591      cpu_core/cycles/
             6,405,163,221      cpu_atom/cycles/
               555,553,778      cpu_core/instructions/
               841,158,734      cpu_atom/instructions/
      
               1.002644773 seconds time elapsed
      
      No shadow stat 'insn per cycle' is reported yet. We will support
      it later; for now just skip the 'perf stat metrics (shadow stat) test'.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-26-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Support 'Convert perf time to TSC' test for hybrid · d9da6f70
      Jin Yao authored
      
      
      On a hybrid platform, "cycles:u" creates two "cycles" events, so
      the second evsel in the evlist also needs initialization.
      
      With this patch,
      
        # ./perf test 71
        71: Convert perf time to TSC                                        : Ok
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-25-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Support 'Session topology' test for hybrid · c1020388
      Jin Yao authored
      
      
      Force the creation of a single event "cpu_core/cycles/" by default,
      otherwise the check 'if (evlist->core.nr_entries == 1)' in
      evlist__valid_sample_type() would fail.
      
        # ./perf test 41
        41: Session topology                                                : Ok
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-24-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Support 'Parse and process metrics' test for hybrid · 6081e876
      Jin Yao authored
      
      
      Some events are not supported on hybrid, so only pick a subset of the test cases.
      
        # ./perf test 68
        68: Parse and process metrics                                       : Ok
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-23-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Support 'Track with sched_switch' test for hybrid · 43eb05d0
      Jin Yao authored
      
      
      On a hybrid platform, "cycles:u" creates two "cycles" events, so
      the number of events in the evlist is not what the next test
      steps expect. For now just use one event, "cpu_core/cycles:u/", for hybrid.
      
        # ./perf test 35
        35: Track with sched_switch                                         : Ok
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-22-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Skip 'Setup struct perf_event_attr' test for hybrid · f15da0b1
      Jin Yao authored
      
      
      For hybrid, the attr.type consists of the PMU type id + the original type.
      This test will need many changes. For now, temporarily
      skip this test case and leave it as a TODO.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-21-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Add hybrid cases for 'Roundtrip evsel->name' test · afff9f31
      Jin Yao authored
      
      
      On hybrid, two events are created for one hw event.
      
      For example,
      
        evsel->idx      evsel__name(evsel)
        0               cycles
        1               cycles
        2               instructions
        3               instructions
        ...
      
      So for comparing the evsel name on hybrid, the evsel->idx
      needs to be divided by 2.
      
        # ./perf test 14
        14: Roundtrip evsel->name                                           : Ok
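
The index mapping described in this commit can be sketched in a few lines. This is only an illustration: hw_names holds the two names from the table, and expected_name is a hypothetical helper, not a function from the test.

```c
#include <stddef.h>

/* Hardware event names in creation order (the two from the table). */
static const char *const hw_names[] = { "cycles", "instructions" };

/*
 * On hybrid every hw event appears twice in the evlist, so integer
 * division by 2 maps the evsel index back to the name table.
 * Returns NULL when the index is out of range.
 */
static const char *expected_name(int evsel_idx, int is_hybrid)
{
	int i = is_hybrid ? evsel_idx / 2 : evsel_idx;
	int n = (int)(sizeof(hw_names) / sizeof(hw_names[0]));

	if (i < 0 || i >= n)
		return NULL;
	return hw_names[i];
}
```

With is_hybrid set, indexes 0 and 1 both map to "cycles" and indexes 2 and 3 to "instructions", as in the table.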
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-20-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Add hybrid cases for 'Parse event definition strings' test · 2541cb63
      Jin Yao authored
      
      
      Add basic hybrid test cases for 'Parse event definition strings' test.
      
        # perf test 6
         6: Parse event definition strings                                  : Ok
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-19-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Uniquify hybrid event name · 91c0f5ec
      Jin Yao authored
      
      
      For perf-record, it is useful to tell the user which PMU the
      event belongs to.
      
      For example,
      
        # perf record -a -- sleep 1
        # perf report
      
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 106  of event 'cpu_core/cycles/'
        # Event count (approx.): 22043448
        #
        # Overhead  Command       Shared Object            Symbol
        # ........  ............  .......................  ............................
        #
        ...
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-18-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Warn group events from different hybrid PMU · 660e533e
      Jin Yao authored
      
      
      If a group has events from different hybrid PMUs,
      show a warning:
      
      "WARNING: events in group from different hybrid PMUs!"
      
      This is to remind the user not to put the core event and atom
      event into one group.
      
      Then grouping is simply disabled.
      
        # perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -a -- sleep 1
        WARNING: events in group from different hybrid PMUs!
        WARNING: grouped events cpus do not match, disabling group:
          anon group { cpu_core/cycles/, cpu_atom/cycles/ }
      
         Performance counter stats for 'system wide':
      
                 5,438,125      cpu_core/cycles/
                 3,914,586      cpu_atom/cycles/
      
               1.004250966 seconds time elapsed
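
The check behind this warning can be pictured as comparing the PMU of every group member with that of the first member. The sketch below is a simplification under assumptions: it works on plain PMU name strings, and group_mixes_pmus is a hypothetical helper rather than the function added by the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true when the members of one group do not all share a PMU name. */
static bool group_mixes_pmus(const char *const pmu_names[], size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (strcmp(pmu_names[0], pmu_names[i]) != 0)
			return true;
	}
	return false;
}
```

For the group in the example, { "cpu_core", "cpu_atom" }, the helper returns true, which corresponds to the case where the warning is printed.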
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-17-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Filter out unmatched aggregation for hybrid event · 92637cc7
      Jin Yao authored
      
      
      perf-stat supports several aggregation modes, such as --per-core
      and --per-socket. A hybrid event, however, may be available on
      only a subset of the CPUs. So for --per-core we need to filter out
      the unavailable cores, for --per-socket the unavailable
      sockets, and so on.
      
      Before:
      
        # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
        S0-D0-C0           2            479,530      cpu_core/cycles/
        S0-D0-C4           2            175,007      cpu_core/cycles/
        S0-D0-C8           2            166,240      cpu_core/cycles/
        S0-D0-C12          2            704,673      cpu_core/cycles/
        S0-D0-C16          2            865,835      cpu_core/cycles/
        S0-D0-C20          2          2,958,461      cpu_core/cycles/
        S0-D0-C24          2            163,988      cpu_core/cycles/
        S0-D0-C28          2            164,729      cpu_core/cycles/
        S0-D0-C32          0      <not counted>      cpu_core/cycles/
        S0-D0-C33          0      <not counted>      cpu_core/cycles/
        S0-D0-C34          0      <not counted>      cpu_core/cycles/
        S0-D0-C35          0      <not counted>      cpu_core/cycles/
        S0-D0-C36          0      <not counted>      cpu_core/cycles/
        S0-D0-C37          0      <not counted>      cpu_core/cycles/
        S0-D0-C38          0      <not counted>      cpu_core/cycles/
        S0-D0-C39          0      <not counted>      cpu_core/cycles/
      
               1.003597211 seconds time elapsed
      
      After:
      
        # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
        S0-D0-C0           2            210,428      cpu_core/cycles/
        S0-D0-C4           2            444,830      cpu_core/cycles/
        S0-D0-C8           2            435,241      cpu_core/cycles/
        S0-D0-C12          2            423,976      cpu_core/cycles/
        S0-D0-C16          2            859,350      cpu_core/cycles/
        S0-D0-C20          2          1,559,589      cpu_core/cycles/
        S0-D0-C24          2            163,924      cpu_core/cycles/
        S0-D0-C28          2            376,610      cpu_core/cycles/
      
               1.003621290 seconds time elapsed
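
One way to picture the filtering is to drop every aggregation entry on which the event has no CPUs, which is what the second column (2 versus 0) in the output above suggests. The sketch below is a toy model, not the perf implementation: struct aggr_entry and filter_unmatched are hypothetical names.

```c
#include <stddef.h>

/* One row of a per-core aggregation (illustrative model only). */
struct aggr_entry {
	int core_id;   /* e.g. 0 for S0-D0-C0 */
	int nr_cpus;   /* CPUs of this core on which the event is available */
};

/*
 * Compact the array in place, keeping only entries where the event is
 * available on at least one CPU. Returns the number of entries kept.
 */
static size_t filter_unmatched(struct aggr_entry *entries, size_t n)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (entries[i].nr_cpus > 0)
			entries[kept++] = entries[i];
	}
	return kept;
}
```

Applied to rows such as C0 and C4 with two CPUs each and C32 and C33 with none, only the first two rows remain, mirroring the "After" output.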
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Co-developed-by: Jiri Olsa <jolsa@redhat.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-16-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Add default hybrid events · ac2dc29e
      Jin Yao authored
      
      
      Previously, if '-e' was not specified in perf stat, some software events
      and hardware events were added to the evlist by default.
      
      Before:
      
        # perf stat -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 24,044.40 msec cpu-clock                 #   23.946 CPUs utilized
                        99      context-switches          #    4.117 /sec
                        24      cpu-migrations            #    0.998 /sec
                         3      page-faults               #    0.125 /sec
                 7,000,244      cycles                    #    0.000 GHz
                 2,955,024      instructions              #    0.42  insn per cycle
                   608,941      branches                  #   25.326 K/sec
                    31,991      branch-misses             #    5.25% of all branches
      
               1.004106859 seconds time elapsed
      
      Among the events, cycles, instructions, branches and branch-misses
      are hardware events.
      
      On a hybrid platform, two hardware events are created for one
      hardware event:
      
      cpu_core/cycles/,
      cpu_atom/cycles/,
      cpu_core/instructions/,
      cpu_atom/instructions/,
      cpu_core/branches/,
      cpu_atom/branches/,
      cpu_core/branch-misses/,
      cpu_atom/branch-misses/
      
      These events would be added to evlist on hybrid platform.
      
      Since parse_events() already supports creating two hardware events
      for one event on a hybrid platform, we just use parse_events(evlist,
      "cycles,instructions,branches,branch-misses") to create the default
      events and add them to the evlist.
      
      After:
      
        # perf stat -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 24,043.99 msec cpu-clock                 #   23.991 CPUs utilized
                       139      context-switches          #    5.781 /sec
                        25      cpu-migrations            #    1.040 /sec
                         6      page-faults               #    0.250 /sec
                10,381,751      cpu_core/cycles/          #  431.782 K/sec
                 1,264,216      cpu_atom/cycles/          #   52.579 K/sec
                 3,406,958      cpu_core/instructions/    #  141.697 K/sec
                   414,588      cpu_atom/instructions/    #   17.243 K/sec
                   705,149      cpu_core/branches/        #   29.327 K/sec
                    82,358      cpu_atom/branches/        #    3.425 K/sec
                    40,821      cpu_core/branch-misses/   #    1.698 K/sec
                     9,086      cpu_atom/branch-misses/   #  377.891 /sec
      
               1.002228863 seconds time elapsed
      
      We can see two events are created for one hardware event.
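
The expansion of one hardware event name into a cpu_core and a cpu_atom variant can be illustrated with plain string formatting. This sketch does not call parse_events(); expand_hybrid and hybrid_pmus are hypothetical names, and the PMU list is simply the two PMUs seen in the output above.

```c
#include <stddef.h>
#include <stdio.h>

static const char *const hybrid_pmus[] = { "cpu_core", "cpu_atom" };

/*
 * Write "cpu_core/<ev>/,cpu_atom/<ev>/,..." for every event name into 'out'.
 * Returns the number of expanded names, or -1 if 'out' is too small.
 */
static int expand_hybrid(const char *const events[], size_t n_events,
			 char *out, size_t out_len)
{
	size_t used = 0;
	int count = 0;

	if (out_len == 0)
		return -1;
	out[0] = '\0';

	for (size_t i = 0; i < n_events; i++) {
		for (size_t p = 0; p < 2; p++) {
			int w = snprintf(out + used, out_len - used, "%s%s/%s/",
					 count ? "," : "", hybrid_pmus[p],
					 events[i]);

			if (w < 0 || (size_t)w >= out_len - used)
				return -1;
			used += (size_t)w;
			count++;
		}
	}
	return count;
}
```

Passing { "cycles", "instructions" } produces four names in the same core-then-atom order as the list in this commit message.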
      
      One TODO: the shadow stats look a bit different, now it's just
      'M/sec'.
      
      perf_stat__update_shadow_stats() and perf_stat__print_shadow_stats()
      need to be improved in the future if we want to get the original shadow
      stats.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-15-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Create two hybrid 'cycles' events by default · b53a0755
      Jin Yao authored
      
      
      When the evlist is empty, for example when no '-e' is specified in
      perf record, one default 'cycles' event is added to the evlist.
      
      On a hybrid platform, however, two default 'cycles' events need to be
      created: one for cpu_core, the other for cpu_atom.
      
      This patch calls evsel__new_cycles() twice to create the
      two 'cycles' events.
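
The two config values in the verbose output of this commit, 0x400000000 and 0x800000000, are consistent with a PMU type id stored above bit 32 and the hardware event id ('cycles' is 0) in the low 32 bits. The sketch below only illustrates that arithmetic: the macro and function names are hypothetical, and the PMU type ids 4 and 8 are read off that output rather than taken from any header.

```c
#include <stdint.h>

/* Assumed layout: PMU type id in the upper 32 bits, hw event id below. */
#define HYBRID_PMU_TYPE_SHIFT 32
#define HYBRID_HW_EVENT_MASK  0xffffffffULL

/* Combine a PMU type id and a hardware event id into one config value. */
static uint64_t hybrid_hw_config(uint32_t pmu_type, uint32_t hw_event_id)
{
	return ((uint64_t)pmu_type << HYBRID_PMU_TYPE_SHIFT) | hw_event_id;
}

/* Extract the PMU type id from a config value. */
static uint32_t config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> HYBRID_PMU_TYPE_SHIFT);
}

/* Extract the hardware event id from a config value. */
static uint32_t config_hw_event(uint64_t config)
{
	return (uint32_t)(config & HYBRID_HW_EVENT_MASK);
}
```

Under these assumptions hybrid_hw_config(4, 0) gives 0x400000000 and hybrid_hw_config(8, 0) gives 0x800000000, the values shown for the two perf_event_attr dumps.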
      
        # ./perf record -vv -a -- sleep 1
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          freq                             1
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 6
        sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 7
        sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 9
        sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 10
        sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 11
        sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 12
        sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 13
        sys_perf_event_open: pid -1  cpu 8  group_fd -1  flags 0x8 = 14
        sys_perf_event_open: pid -1  cpu 9  group_fd -1  flags 0x8 = 15
        sys_perf_event_open: pid -1  cpu 10  group_fd -1  flags 0x8 = 16
        sys_perf_event_open: pid -1  cpu 11  group_fd -1  flags 0x8 = 17
        sys_perf_event_open: pid -1  cpu 12  group_fd -1  flags 0x8 = 18
        sys_perf_event_open: pid -1  cpu 13  group_fd -1  flags 0x8 = 19
        sys_perf_event_open: pid -1  cpu 14  group_fd -1  flags 0x8 = 20
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 21
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          freq                             1
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 22
        sys_perf_event_open: pid -1  cpu 17  group_fd -1  flags 0x8 = 23
        sys_perf_event_open: pid -1  cpu 18  group_fd -1  flags 0x8 = 24
        sys_perf_event_open: pid -1  cpu 19  group_fd -1  flags 0x8 = 25
        sys_perf_event_open: pid -1  cpu 20  group_fd -1  flags 0x8 = 26
        sys_perf_event_open: pid -1  cpu 21  group_fd -1  flags 0x8 = 27
        sys_perf_event_open: pid -1  cpu 22  group_fd -1  flags 0x8 = 28
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 29
        ------------------------------------------------------------
      
      We have to create evlist-hybrid.c; otherwise, due to the symbol
      dependency, the perf python test would fail.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b53a0755
    • Jin Yao's avatar
      perf parse-events: Support event inside hybrid pmu · 5e4edd1f
      Jin Yao authored
      
      
      On a hybrid platform, the user may want to enable events on only one pmu.
      
      The following syntax is supported:
      
      cpu_core/<event>/
      cpu_atom/<event>/
      
      But this syntax doesn't work for cache events.
      
      Before:
      
        # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1
        event syntax error: 'cpu_core/LLC-loads/'
                                      \___ unknown term 'LLC-loads' for pmu 'cpu_core'
      
      Cache events are a bit complex, and we can't create aliases for them,
      so we use another solution. For example, if we use "cpu_core/LLC-loads/",
      then in parse_events_add_pmu() term->config is "LLC-loads".
      
      We then create a new parser to scan "LLC-loads", and
      parse_events_add_cache() is called during that parsing.
      parse_state->hybrid_pmu_name identifies the pmu on which the event
      should be enabled.
      
      After:
      
        # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                    24,593      cpu_core/LLC-loads/
      
               1.003911601 seconds time elapsed
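
The "pmu/term/" form described above can be illustrated with a small string splitter. This is only a sketch: perf's real event parser is generated from grammar files, and split_pmu_event() is a hypothetical helper that does not exist in perf. It shows how "cpu_core/LLC-loads/" separates into a pmu name and a term that could then be handed to a second parsing pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "pmu/term/" into its two parts (hypothetical helper, not perf code).
 * Returns 0 on success, -1 if the input is malformed or a buffer is too small.
 */
int split_pmu_event(const char *spec, char *pmu, size_t pmu_sz,
                    char *term, size_t term_sz)
{
    const char *first, *last;
    size_t pmu_len, term_len;

    assert(spec && pmu && term);

    first = strchr(spec, '/');
    if (!first)
        return -1;

    /* The second slash must be the final character of the spec. */
    last = strchr(first + 1, '/');
    if (!last || last[1] != '\0')
        return -1;

    pmu_len = (size_t)(first - spec);
    term_len = (size_t)(last - first - 1);
    if (pmu_len == 0 || term_len == 0 ||
        pmu_len >= pmu_sz || term_len >= term_sz)
        return -1;

    memcpy(pmu, spec, pmu_len);
    pmu[pmu_len] = '\0';
    memcpy(term, first + 1, term_len);
    term[term_len] = '\0';
    return 0;
}
```

Calling it on "cpu_core/LLC-loads/" yields the pmu name "cpu_core" and the term "LLC-loads"; a bare "LLC-loads" is rejected because it carries no pmu prefix.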
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-13-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5e4edd1f
    • Jin Yao's avatar
      perf parse-events: Compare with hybrid pmu name · c93afadc
      Jin Yao authored
      
      
      On a hybrid platform, the user may want to enable an event on only one pmu.
      The following syntax will be supported:
      
      cpu_core/<event>/
      cpu_atom/<event>/
      
      For hardware events, hardware cache events and raw events, two events
      are created by default. We pass the specified pmu name in parse_state
      and check it before event creation, so that only the event for the
      specified pmu is created.
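
The check described above amounts to filtering the list of hybrid pmus by an optional name. The following is a minimal sketch under that reading; select_hybrid_pmus() and its arguments are hypothetical and do not mirror perf's internal data structures. A NULL request means "no pmu specified", so every pmu is kept.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy into 'selected' the pmus an event should be created for
 * (hypothetical helper). With requested == NULL all pmus are kept;
 * otherwise only the pmu whose name matches is kept.
 * Returns the number of selected pmus.
 */
size_t select_hybrid_pmus(const char *requested, const char *const *pmus,
                          size_t n, const char **selected)
{
    size_t i, count = 0;

    assert(pmus && selected);

    for (i = 0; i < n; i++) {
        if (requested && strcmp(requested, pmus[i]) != 0)
            continue;   /* a pmu was named and this is not it */
        selected[count++] = pmus[i];
    }
    return count;
}
```

With the pmu list { "cpu_core", "cpu_atom" }, a NULL request selects both entries, while "cpu_atom" selects only the second one.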
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-12-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c93afadc
    • Jin Yao's avatar
      perf parse-events: Create two hybrid raw events · 94da591b
      Jin Yao authored
      
      
      On a hybrid platform, the same raw event may be available on both the
      cpu_core pmu and the cpu_atom pmu, so two raw events are created for
      one event encoding. For raw events, attr.type is the PMU type.
      
        # perf stat -e r3c -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          type                             4
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             4
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          type                             8
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             8
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        r3c: 0: 434449 1001412521 1001412521
        r3c: 1: 173162 1001482031 1001482031
        r3c: 2: 231710 1001524974 1001524974
        r3c: 3: 110012 1001563523 1001563523
        r3c: 4: 191517 1001593221 1001593221
        r3c: 5: 956458 1001628147 1001628147
        r3c: 6: 416969 1001715626 1001715626
        r3c: 7: 1047527 1001596650 1001596650
        r3c: 8: 103877 1001633520 1001633520
        r3c: 9: 70571 1001637898 1001637898
        r3c: 10: 550284 1001714398 1001714398
        r3c: 11: 1257274 1001738349 1001738349
        r3c: 12: 107797 1001801432 1001801432
        r3c: 13: 67471 1001836281 1001836281
        r3c: 14: 286782 1001923161 1001923161
        r3c: 15: 815509 1001952550 1001952550
        r3c: 0: 95994 1002071117 1002071117
        r3c: 1: 105570 1002142438 1002142438
        r3c: 2: 115921 1002189147 1002189147
        r3c: 3: 72747 1002238133 1002238133
        r3c: 4: 103519 1002276753 1002276753
        r3c: 5: 121382 1002315131 1002315131
        r3c: 6: 80298 1002248050 1002248050
        r3c: 7: 466790 1002278221 1002278221
        r3c: 6821369 16026754282 16026754282
        r3c: 1162221 8017758990 8017758990
      
         Performance counter stats for 'system wide':
      
                 6,821,369      cpu_core/r3c/
                 1,162,221      cpu_atom/r3c/
      
               1.002289965 seconds time elapsed
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-11-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      94da591b
    • Jin Yao's avatar
      perf parse-events: Create two hybrid cache events · 30def61f
      Jin Yao authored
      
      
      Cache events have pre-defined configs. The kernel needs to know
      which pmu a cache event comes from (e.g. cpu_core or cpu_atom),
      but the perf type PERF_TYPE_HW_CACHE can't carry pmu information.
      
      So the type PERF_TYPE_HW_CACHE is now extended to be a PMU-aware type.
      The PMU type ID is stored in attr.config[63:32].
      
      When enabling a hybrid cache event without a specified pmu, such as
      'perf stat -e LLC-loads -a', two events are created
      automatically: one for atom and one for core.
      
        # perf stat -e LLC-loads -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x400000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x400000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x800000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x800000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        LLC-loads: 0: 1507 1001800280 1001800280
        LLC-loads: 1: 666 1001812250 1001812250
        LLC-loads: 2: 3353 1001813453 1001813453
        LLC-loads: 3: 514 1001848795 1001848795
        LLC-loads: 4: 627 1001952832 1001952832
        LLC-loads: 5: 4399 1001451154 1001451154
        LLC-loads: 6: 1240 1001481052 1001481052
        LLC-loads: 7: 478 1001520348 1001520348
        LLC-loads: 8: 691 1001551236 1001551236
        LLC-loads: 9: 310 1001578945 1001578945
        LLC-loads: 10: 1018 1001594354 1001594354
        LLC-loads: 11: 3656 1001622355 1001622355
        LLC-loads: 12: 882 1001661416 1001661416
        LLC-loads: 13: 506 1001693963 1001693963
        LLC-loads: 14: 3547 1001721013 1001721013
        LLC-loads: 15: 1399 1001734818 1001734818
        LLC-loads: 0: 1314 1001793826 1001793826
        LLC-loads: 1: 2857 1001752764 1001752764
        LLC-loads: 2: 646 1001830694 1001830694
        LLC-loads: 3: 1612 1001864861 1001864861
        LLC-loads: 4: 2244 1001912381 1001912381
        LLC-loads: 5: 1255 1001943889 1001943889
        LLC-loads: 6: 4624 1002021109 1002021109
        LLC-loads: 7: 2703 1001959302 1001959302
        LLC-loads: 24793 16026838264 16026838264
        LLC-loads: 17255 8015078826 8015078826
      
         Performance counter stats for 'system wide':
      
                    24,793      cpu_core/LLC-loads/
                    17,255      cpu_atom/LLC-loads/
      
               1.001970988 seconds time elapsed
      
      0x4 in 0x400000002 indicates the cpu_core pmu.
      0x8 in 0x800000002 indicates the cpu_atom pmu.
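
As a worked illustration of how a value such as 0x400000002 can be built, the sketch below combines the usual hardware cache config layout (cache id in bits 7:0, operation in bits 15:8, result in bits 23:16) with the PMU type ID in bits 63:32. The helper hybrid_cache_config() and the local enum names are assumptions made for this example; the numeric values are intended to mirror the cache enums in linux/perf_event.h, and LLC-loads is taken to be the last-level cache, read, access combination.

```c
#include <stdint.h>

/* Local stand-ins for the cache enums in linux/perf_event.h. */
enum { CACHE_ID_LL = 2 };          /* last-level cache */
enum { CACHE_OP_READ = 0 };        /* read (load) operation */
enum { CACHE_RESULT_ACCESS = 0 };  /* count accesses, not misses */

/* Build attr.config for a PMU-aware cache event (hypothetical helper). */
uint64_t hybrid_cache_config(uint32_t pmu_type, uint8_t cache_id,
                             uint8_t op, uint8_t result)
{
    uint64_t cache = (uint64_t)cache_id |
                     ((uint64_t)op << 8) |
                     ((uint64_t)result << 16);

    /* The PMU type ID occupies config[63:32]. */
    return ((uint64_t)pmu_type << 32) | cache;
}
```

With PMU type 4 (cpu_core in the output above) this gives 0x400000002, and with PMU type 8 (cpu_atom) it gives 0x800000002.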
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      30def61f
    • Jin Yao's avatar
      perf parse-events: Create two hybrid hardware events · 9cbfa2f6
      Jin Yao authored
      
      
      Hardware events currently use the special perf type PERF_TYPE_HARDWARE,
      which doesn't pass the PMU type through the user interface. On a hybrid
      system, the kernel therefore doesn't know which PMU an event belongs to.
      
      So this type is now extended to be a PMU-aware type. The PMU type ID
      is stored in attr.config[63:32].
      
      The PMU type ID is retrieved from sysfs.
      
        root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
        8
      
        root@lkp-adl-d01:/sys/devices/cpu_core# cat type
        4
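
Reading the type ID programmatically is a matter of parsing one integer from the sysfs "type" file. The sketch below is an illustration only; read_pmu_type() is a hypothetical helper, not the function perf uses, and the path is passed in so the same code works for cpu_core and cpu_atom.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a PMU type ID from a sysfs-style "type" file, for example
 * /sys/devices/cpu_core/type (hypothetical helper).
 * Returns the ID, or -1 if the file cannot be opened or parsed.
 */
int read_pmu_type(const char *path)
{
    FILE *fp;
    int type;

    assert(path != NULL);

    fp = fopen(path, "r");
    if (!fp)
        return -1;

    if (fscanf(fp, "%d", &type) != 1)
        type = -1;

    fclose(fp);
    return type;
}
```

On the machine shown above, read_pmu_type("/sys/devices/cpu_atom/type") would be expected to return 8; on a non-hybrid machine the file does not exist and the helper returns -1.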
      
      When enabling a hybrid hardware event without a specified pmu, such as
      'perf stat -e cycles -a', two events are created automatically: one
      for atom and one for core.
      
        # perf stat -e cycles -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        cycles: 0: 836272 1001525722 1001525722
        cycles: 1: 628564 1001580453 1001580453
        cycles: 2: 872693 1001605997 1001605997
        cycles: 3: 70417 1001641369 1001641369
        cycles: 4: 88593 1001726722 1001726722
        cycles: 5: 470495 1001752993 1001752993
        cycles: 6: 484733 1001840440 1001840440
        cycles: 7: 1272477 1001593105 1001593105
        cycles: 8: 209185 1001608616 1001608616
        cycles: 9: 204391 1001633962 1001633962
        cycles: 10: 264121 1001661745 1001661745
        cycles: 11: 826104 1001689904 1001689904
        cycles: 12: 89935 1001728861 1001728861
        cycles: 13: 70639 1001756757 1001756757
        cycles: 14: 185266 1001784810 1001784810
        cycles: 15: 171094 1001825466 1001825466
        cycles: 0: 129624 1001854843 1001854843
        cycles: 1: 122533 1001840421 1001840421
        cycles: 2: 90055 1001882506 1001882506
        cycles: 3: 139607 1001896463 1001896463
        cycles: 4: 141791 1001907838 1001907838
        cycles: 5: 530927 1001883880 1001883880
        cycles: 6: 143246 1001852529 1001852529
        cycles: 7: 667769 1001872626 1001872626
        cycles: 6744979 16026956922 16026956922
        cycles: 1965552 8014991106 8014991106
      
         Performance counter stats for 'system wide':
      
                 6,744,979      cpu_core/cycles/
                 1,965,552      cpu_atom/cycles/
      
               1.001882711 seconds time elapsed
      
      0x4 in 0x400000000 indicates the cpu_core pmu.
      0x8 in 0x800000000 indicates the cpu_atom pmu.
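
The encoding can be sketched as a shift and an OR. The helper names and the PMU_TYPE_SHIFT macro below are made up for this example; only the bit layout (PMU type ID in config[63:32], hardware event ID in the low 32 bits) comes from the description above. The 'cycles' event has hardware event ID 0, which is why the low half of both configs is zero.

```c
#include <stdint.h>

#define PMU_TYPE_SHIFT 32   /* PMU type ID lives in config[63:32] */

/* Combine a PMU type ID with a hardware event ID (hypothetical helper). */
uint64_t hybrid_hw_config(uint32_t pmu_type, uint32_t hw_id)
{
    return ((uint64_t)pmu_type << PMU_TYPE_SHIFT) | hw_id;
}

/* Recover the PMU type ID from an extended config value. */
uint32_t hybrid_config_pmu_type(uint64_t config)
{
    return (uint32_t)(config >> PMU_TYPE_SHIFT);
}

/* Recover the hardware event ID from an extended config value. */
uint32_t hybrid_config_hw_id(uint64_t config)
{
    return (uint32_t)(config & 0xffffffffULL);
}
```

For example, hybrid_hw_config(4, 0) evaluates to 0x400000000 and hybrid_config_pmu_type(0x800000000) evaluates to 8.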
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-9-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9cbfa2f6
    • Jin Yao's avatar
      perf stat: Uniquify hybrid event name · 12279429
      Jin Yao authored
      
      
      It is useful to let the user know which pmu an event belongs to.
      perf-stat already supports the '--no-merge' option, which prints the pmu
      name after the event name, such as:
      
      "cycles [cpu_core]"
      
      This option is now enabled by default on hybrid platforms, but the
      format is changed to:
      
      "cpu_core/cycles/"
      
      If the user configures a name, we still use the user-specified name.
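
The naming rule can be sketched with snprintf(). hybrid_display_name() is a hypothetical helper written for this illustration, not the function perf uses: it prefers a user-supplied name when there is one and otherwise produces the "pmu/event/" form.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write the name to display for a hybrid event into buf
 * (hypothetical helper). A non-empty user_name wins; otherwise the
 * result is "<pmu>/<event>/". Returns 0 on success, -1 on truncation.
 */
int hybrid_display_name(char *buf, size_t size, const char *pmu,
                        const char *event, const char *user_name)
{
    int n;

    assert(buf && pmu && event);

    if (user_name && user_name[0] != '\0')
        n = snprintf(buf, size, "%s", user_name);
    else
        n = snprintf(buf, size, "%s/%s/", pmu, event);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Passing "cpu_core", "cycles" and a NULL user name produces "cpu_core/cycles/".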
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      12279429
    • Jin Yao's avatar
      perf pmu: Add hybrid helper functions · c5a26ea4
      Jin Yao authored
      
      
      The functions perf_pmu__is_hybrid and perf_pmu__find_hybrid_pmu
      can be used to identify the hybrid platform and return the found
      hybrid cpu pmu. All the detected hybrid pmus have been saved in the
      'perf_pmu__hybrid_pmus' list, so we just need to search this list.
      
      perf_pmu__hybrid_type_to_pmu converts the user-specified string
      to a hybrid pmu name. This is used to support the '--cputype' option
      in upcoming patches.
      
      perf_pmu__has_hybrid checks for the existence of a hybrid pmu. Note that
      we have to define it in pmu.c (to avoid a symbol dependency on
      pmu-hybrid.c), otherwise the perf python test would fail.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-7-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c5a26ea4
    • Jin Yao's avatar
      perf pmu: Save detected hybrid pmus to a global pmu list · 44462430
      Jin Yao authored
      
      
      We identify the cpu_core pmu and cpu_atom pmu by explicitly
      checking the following files:
      
      For cpu_core, check:
      "/sys/bus/event_source/devices/cpu_core/cpus"
      
      For cpu_atom, check:
      "/sys/bus/event_source/devices/cpu_atom/cpus"
      
      If the 'cpus' file exists and it has data, the pmu exists.
      
      But in order not to hardcode "cpu_core" and "cpu_atom", the code
      is written in a generic way.
      
      So if the path "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the
      hybrid pmu exists. All the detected hybrid pmus are linked to a global
      list 'perf_pmu__hybrid_pmus', and afterwards we just need to iterate the
      list with perf_pmu__for_each_hybrid_pmu to get all hybrid pmus.
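
The generic detection rule can be sketched as a directory scan. This is an illustration under stated assumptions, not perf's implementation: find_hybrid_pmus(), has_cpus_data() and PMU_NAME_MAX are hypothetical names, the result is stored in a plain array rather than a linked list, and the devices directory is a parameter so the code can be pointed at "/sys/bus/event_source/devices" or at a test directory.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define PMU_NAME_MAX 64

/* Return 1 if <dir>/<name>/cpus is a regular file holding at least one byte. */
static int has_cpus_data(const char *dir, const char *name)
{
    char path[4096];
    struct stat st;
    FILE *fp;
    int n, has_data;

    n = snprintf(path, sizeof(path), "%s/%s/cpus", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return 0;
    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
        return 0;

    fp = fopen(path, "r");
    if (!fp)
        return 0;
    has_data = fgetc(fp) != EOF;
    fclose(fp);
    return has_data;
}

/*
 * Collect the names of all "cpu_*" entries under 'dir' that have a
 * non-empty 'cpus' file. Returns the number found, or -1 if 'dir'
 * cannot be opened.
 */
int find_hybrid_pmus(const char *dir, char names[][PMU_NAME_MAX], int max)
{
    DIR *d;
    struct dirent *ent;
    int count = 0;

    assert(dir && names);

    d = opendir(dir);
    if (!d)
        return -1;

    while (count < max && (ent = readdir(d)) != NULL) {
        if (strncmp(ent->d_name, "cpu_", 4) != 0)
            continue;   /* not a cpu_xxx entry */
        if (strlen(ent->d_name) >= PMU_NAME_MAX)
            continue;   /* name would not fit */
        if (!has_cpus_data(dir, ent->d_name))
            continue;   /* 'cpus' missing or empty */
        strcpy(names[count++], ent->d_name);
    }

    closedir(d);
    return count;
}
```

The file is read rather than sized with stat() because sysfs files generally do not report their real content length.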
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-6-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      44462430
    • Jin Yao's avatar
      perf pmu: Save pmu name · 32705de7
      Jin Yao authored
      
      
      On a hybrid platform, an event may be available on only one pmu
      (for example, on cpu_core or on cpu_atom).
      
      This patch saves the pmu name to the pmu field of struct perf_pmu_alias,
      so that we can later tell which pmu the event can be enabled on.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-5-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      32705de7
    • Jin Yao's avatar
      perf pmu: Simplify arguments of __perf_pmu__new_alias · eab35953
      Jin Yao authored
      
      
      Simplify the arguments of __perf_pmu__new_alias() by passing the whole
      'struct pmu_event' pointer.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-4-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eab35953
    • Jin Yao's avatar
      perf jevents: Support unit value "cpu_core" and "cpu_atom" · 6b64833b
      Jin Yao authored
      
      
      Some Intel platforms, such as Alderlake, are hybrid platforms that
      consist of atom cpus and core cpus. Each cpu type has a dedicated event
      list: some events are available on the core cpu, and some are
      available on the atom cpu.
      
      The kernel exports new cpu pmus: cpu_core and cpu_atom. Events in
      the json files gain a new field "Unit" to indicate which pmu the event
      is available on.
      
      For example, one event in cache.json,
      
          {
              "BriefDescription": "Counts the number of load ops retired that",
              "CollectPEBSRecord": "2",
              "Counter": "0,1,2,3",
              "EventCode": "0xd2",
              "EventName": "MEM_LOAD_UOPS_RETIRED_MISC.MMIO",
              "PEBScounters": "0,1,2,3",
              "SampleAfterValue": "1000003",
              "UMask": "0x80",
              "Unit": "cpu_atom"
          },
      
      The unit "cpu_atom" indicates this event is only available on "cpu_atom".
      
      In generated pmu-events.c, we can see:
      
      {
              .name = "mem_load_uops_retired_misc.mmio",
              .event = "period=1000003,umask=0x80,event=0xd2",
              .desc = "Counts the number of load ops retired that. Unit: cpu_atom ",
              .topic = "cache",
              .pmu = "cpu_atom",
      },
      
      Without this patch, however, the "uncore_" prefix is added before
      "cpu_atom", such as:
              .pmu = "uncore_cpu_atom"
      
      That would be the wrong pmu.
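
The behaviour described above can be modelled with a small mapping function. This is a sketch only: unit_to_pmu() is a hypothetical helper, it models just the default "uncore_" prefixing mentioned in this message, and it does not reproduce whatever other unit handling jevents performs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Map a json "Unit" value to a pmu name (hypothetical helper).
 * "cpu_core" and "cpu_atom" are kept as-is; any other unit gets the
 * "uncore_" prefix. Returns 0 on success, -1 on truncation.
 */
int unit_to_pmu(const char *unit, char *buf, size_t size)
{
    static const char *const cpu_pmus[] = { "cpu_core", "cpu_atom" };
    size_t i;
    int n;

    assert(unit && buf);

    for (i = 0; i < sizeof(cpu_pmus) / sizeof(cpu_pmus[0]); i++) {
        if (strcmp(unit, cpu_pmus[i]) == 0) {
            n = snprintf(buf, size, "%s", unit);
            return (n < 0 || (size_t)n >= size) ? -1 : 0;
        }
    }

    n = snprintf(buf, size, "uncore_%s", unit);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With this mapping, "cpu_atom" stays "cpu_atom" instead of becoming "uncore_cpu_atom".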
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-3-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6b64833b
    • Jin Yao's avatar
      tools headers uapi: Update tools's copy of linux/perf_event.h · 41273611
      Jin Yao authored
      To get the changes in:
      
      Liang Kan's patch
      
        55bcf6ef ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
      
      Kan's patch is in the tip/perf/core branch.
      
      So the next perf tool patches need this interface for hybrid support.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-2-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      41273611
    • Namhyung Kim's avatar
      perf report: Print percentage of each event statistics · 462f57db
      Namhyung Kim authored
      
      
      It's sometimes useful to see how the number of samples compares with
      other events in the data file, shown as percentages.
      
        $ perf report --stat
      
        Aggregated stats:
                   TOTAL events:      20064
                    MMAP events:        239  ( 1.2%)
                    COMM events:       1518  ( 7.6%)
                    EXIT events:          1  ( 0.0%)
                    FORK events:       1517  ( 7.6%)
                  SAMPLE events:       4015  (20.0%)
                   MMAP2 events:      12769  (63.6%)
          FINISHED_ROUND events:          2  ( 0.0%)
              THREAD_MAP events:          1  ( 0.0%)
                 CPU_MAP events:          1  ( 0.0%)
               TIME_CONV events:          1  ( 0.0%)
        cycles stats:
                  SAMPLE events:       2475
        instructions stats:
                  SAMPLE events:       1540
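
The percentage column above can be reproduced with a small calculation. The following is a minimal Python sketch, not the perf implementation (which is written in C); the function name `format_event_stats` and the column widths are assumptions made for illustration, and the sketch assumes TOTAL is the sum of the per-type counts, which holds for the numbers shown above.

```python
def format_event_stats(counts):
    """Format per-type event counts with their share of the total.

    `counts` maps an event type name to the number of events seen.
    The layout only approximates the `perf report --stat` output.
    """
    total = sum(counts.values())
    lines = [f"{'TOTAL':>15} events: {total:10d}"]
    for name, nr in counts.items():
        # Guard against an empty data file to avoid dividing by zero.
        pct = 100.0 * nr / total if total else 0.0
        lines.append(f"{name:>15} events: {nr:10d}  ({pct:4.1f}%)")
    return lines


if __name__ == "__main__":
    # Counts copied from the example output in the commit message.
    example = {
        "MMAP": 239, "COMM": 1518, "EXIT": 1, "FORK": 1517,
        "SAMPLE": 4015, "MMAP2": 12769, "FINISHED_ROUND": 2,
        "THREAD_MAP": 1, "CPU_MAP": 1, "TIME_CONV": 1,
    }
    for line in format_event_stats(example):
        print(line)
```

Each percentage is the event count divided by the total number of events, rounded to one decimal place, which is why single events in a large file show up as 0.0%.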
      
      Suggested-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-7-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      462f57db
    • perf report: Make --skip-empty as default · 8f08cf33
      Namhyung Kim authored
      
      
      Make --skip-empty the default so that the compact output is shown
      without extra options.  Also add the 'report.skip-empty' config option
      to override the default.  Users can also pass the --no-skip-empty
      command line option to change the behavior at any time.
      
      Committer testing:
      
        $ perf report --stat
      
        Aggregated stats:
                   TOTAL events:         19
                    COMM events:          2
                    EXIT events:          1
                  SAMPLE events:          8
                   MMAP2 events:          4
          FINISHED_ROUND events:          1
              THREAD_MAP events:          1
                 CPU_MAP events:          1
               TIME_CONV events:          1
        cycles:u stats:
                  SAMPLE events:          8
        $ perf config report.skip-empty=false
        $ perf report --stat
      
        Aggregated stats:
                   TOTAL events:         19
                    MMAP events:          0
                    LOST events:          0
                    COMM events:          2
                    EXIT events:          1
                THROTTLE events:          0
              UNTHROTTLE events:          0
                    FORK events:          0
                    READ events:          0
                  SAMPLE events:          8
                   MMAP2 events:          4
                     AUX events:          0
            ITRACE_START events:          0
            LOST_SAMPLES events:          0
                  SWITCH events:          0
         SWITCH_CPU_WIDE events:          0
              NAMESPACES events:          0
                 KSYMBOL events:          0
               BPF_EVENT events:          0
                  CGROUP events:          0
               TEXT_POKE events:          0
                    ATTR events:          0
              EVENT_TYPE events:          0
            TRACING_DATA events:          0
                BUILD_ID events:          0
          FINISHED_ROUND events:          1
                ID_INDEX events:          0
           AUXTRACE_INFO events:          0
                AUXTRACE events:          0
          AUXTRACE_ERROR events:          0
              THREAD_MAP events:          1
                 CPU_MAP events:          1
             STAT_CONFIG events:          0
                    STAT events:          0
              STAT_ROUND events:          0
            EVENT_UPDATE events:          0
               TIME_CONV events:          1
                 FEATURE events:          0
              COMPRESSED events:          0
        cycles:u stats:
                  SAMPLE events:          8
        $ perf config report.skip-empty
        report.skip-empty=false
        $
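
The effect of the option on the listing above can be sketched as a simple filter. This is a minimal Python illustration, not the perf code (which is C); the function name `format_stats` and its `skip_empty` parameter are hypothetical stand-ins for the --skip-empty / report.skip-empty behavior described in this commit.

```python
def format_stats(counts, skip_empty=True):
    """List event counts, optionally hiding event types with a zero count.

    `skip_empty=True` mirrors the new default; `skip_empty=False`
    corresponds to --no-skip-empty or report.skip-empty=false.
    """
    total = sum(counts.values())
    lines = [f"{'TOTAL':>15} events: {total:10d}"]
    for name, nr in counts.items():
        if skip_empty and nr == 0:
            continue  # suppress rows that carry no information
        lines.append(f"{name:>15} events: {nr:10d}")
    return lines


if __name__ == "__main__":
    # A subset of the counts from the committer testing output.
    example = {
        "MMAP": 0, "LOST": 0, "COMM": 2, "EXIT": 1, "SAMPLE": 8,
        "MMAP2": 4, "FINISHED_ROUND": 1, "THREAD_MAP": 1,
        "CPU_MAP": 1, "TIME_CONV": 1,
    }
    print("\n".join(format_stats(example)))
    print("\n".join(format_stats(example, skip_empty=False)))
```

The TOTAL line is unaffected by the filter, since rows with a zero count contribute nothing to the sum.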
      
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8f08cf33
    • perf report: Add --skip-empty option to suppress 0 event stat · 2775de0b
      Namhyung Kim authored
      
      
      To make the output more readable, I think it's better to remove the 0's
      from the output.  Also, the dummy event has no event stats, so it just
      wastes space.  Let's use the --skip-empty option to suppress it.
      
        $ perf report --stat --skip-empty
      
        Aggregated stats:
                   TOTAL events:      16530
                    MMAP events:        226
                    COMM events:       1596
                    EXIT events:          2
                THROTTLE events:        121
              UNTHROTTLE events:        117
                    FORK events:       1595
                  SAMPLE events:        719
                   MMAP2 events:      12147
                  CGROUP events:          2
          FINISHED_ROUND events:          2
              THREAD_MAP events:          1
                 CPU_MAP events:          1
               TIME_CONV events:          1
        cycles stats:
                  SAMPLE events:        719
      
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2775de0b
    • perf report: Show event sample counts in --stat output · 55f75444
      Namhyung Kim authored
      
      
      To make the output identical to that of perf report -D, it needs to show
      per-event sample counts along with the aggregated stats at the end.
      
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      55f75444
    • perf hists: Split hists_stats from events_stats · 0f0abbac
      Namhyung Kim authored
      
      
      Each struct hists has an events_stats, but most of its fields were not
      used.  It is only there to count the number of samples and periods,
      whether filtered or not.  The other fields are used only by evlist.

      So it'd be better to split hists_stats from events_stats to reduce
      wasted memory in struct hists.  This also makes the event statistics
      printed by perf report more compact, since 0 events are skipped in each
      evsel/hists.
      
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0f0abbac
    • perf top: Use evlist->events_stat to count events · bf8f8587
      Namhyung Kim authored
      
      
      The stats are mainly used to count lost events for the warning, so it
      should be OK to use evlist->stats instead.  This is needed for changes
      in the next commit.
      
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bf8f8587