Skip to content
  1. Apr 29, 2021
    • Jin Yao's avatar
      perf parse-events: Create two hybrid raw events · 94da591b
      Jin Yao authored
      
      
      On hybrid platform, same raw event is possible to be available
      on both cpu_core pmu and cpu_atom pmu. It's supported to create
      two raw events for one event encoding. For raw events, the
      attr.type is PMU type.
      
        # perf stat -e r3c -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          type                             4
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             4
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          type                             8
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             8
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        r3c: 0: 434449 1001412521 1001412521
        r3c: 1: 173162 1001482031 1001482031
        r3c: 2: 231710 1001524974 1001524974
        r3c: 3: 110012 1001563523 1001563523
        r3c: 4: 191517 1001593221 1001593221
        r3c: 5: 956458 1001628147 1001628147
        r3c: 6: 416969 1001715626 1001715626
        r3c: 7: 1047527 1001596650 1001596650
        r3c: 8: 103877 1001633520 1001633520
        r3c: 9: 70571 1001637898 1001637898
        r3c: 10: 550284 1001714398 1001714398
        r3c: 11: 1257274 1001738349 1001738349
        r3c: 12: 107797 1001801432 1001801432
        r3c: 13: 67471 1001836281 1001836281
        r3c: 14: 286782 1001923161 1001923161
        r3c: 15: 815509 1001952550 1001952550
        r3c: 0: 95994 1002071117 1002071117
        r3c: 1: 105570 1002142438 1002142438
        r3c: 2: 115921 1002189147 1002189147
        r3c: 3: 72747 1002238133 1002238133
        r3c: 4: 103519 1002276753 1002276753
        r3c: 5: 121382 1002315131 1002315131
        r3c: 6: 80298 1002248050 1002248050
        r3c: 7: 466790 1002278221 1002278221
        r3c: 6821369 16026754282 16026754282
        r3c: 1162221 8017758990 8017758990
      
         Performance counter stats for 'system wide':
      
                 6,821,369      cpu_core/r3c/
                 1,162,221      cpu_atom/r3c/
      
               1.002289965 seconds time elapsed
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-11-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      94da591b
    • Jin Yao's avatar
      perf parse-events: Create two hybrid cache events · 30def61f
      Jin Yao authored
      
      
      For cache events, they have pre-defined configs. The kernel needs
      to know where the cache event comes from (e.g. from cpu_core pmu
      or from cpu_atom pmu). But the perf type PERF_TYPE_HW_CACHE
      can't carry pmu information.
      
      Now the type PERF_TYPE_HW_CACHE is extended to be PMU aware type.
      The PMU type ID is stored at attr.config[63:32].
      
      When enabling a hybrid cache event without specified pmu, such as,
      'perf stat -e LLC-loads -a', two events are created
      automatically. One is for atom, the other is for core.
      
        # perf stat -e LLC-loads -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x400000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x400000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x800000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x800000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        LLC-loads: 0: 1507 1001800280 1001800280
        LLC-loads: 1: 666 1001812250 1001812250
        LLC-loads: 2: 3353 1001813453 1001813453
        LLC-loads: 3: 514 1001848795 1001848795
        LLC-loads: 4: 627 1001952832 1001952832
        LLC-loads: 5: 4399 1001451154 1001451154
        LLC-loads: 6: 1240 1001481052 1001481052
        LLC-loads: 7: 478 1001520348 1001520348
        LLC-loads: 8: 691 1001551236 1001551236
        LLC-loads: 9: 310 1001578945 1001578945
        LLC-loads: 10: 1018 1001594354 1001594354
        LLC-loads: 11: 3656 1001622355 1001622355
        LLC-loads: 12: 882 1001661416 1001661416
        LLC-loads: 13: 506 1001693963 1001693963
        LLC-loads: 14: 3547 1001721013 1001721013
        LLC-loads: 15: 1399 1001734818 1001734818
        LLC-loads: 0: 1314 1001793826 1001793826
        LLC-loads: 1: 2857 1001752764 1001752764
        LLC-loads: 2: 646 1001830694 1001830694
        LLC-loads: 3: 1612 1001864861 1001864861
        LLC-loads: 4: 2244 1001912381 1001912381
        LLC-loads: 5: 1255 1001943889 1001943889
        LLC-loads: 6: 4624 1002021109 1002021109
        LLC-loads: 7: 2703 1001959302 1001959302
        LLC-loads: 24793 16026838264 16026838264
        LLC-loads: 17255 8015078826 8015078826
      
         Performance counter stats for 'system wide':
      
                    24,793      cpu_core/LLC-loads/
                    17,255      cpu_atom/LLC-loads/
      
               1.001970988 seconds time elapsed
      
      0x4 in 0x400000002 indicates the cpu_core pmu.
      0x8 in 0x800000002 indicates the cpu_atom pmu.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      30def61f
    • Jin Yao's avatar
      perf parse-events: Create two hybrid hardware events · 9cbfa2f6
      Jin Yao authored
      
      
      Current hardware events has special perf types PERF_TYPE_HARDWARE.
      But it doesn't pass the PMU type in the user interface. For a hybrid
      system, the perf kernel doesn't know which PMU the events belong to.
      
      So now this type is extended to be PMU aware type. The PMU type ID
      is stored at attr.config[63:32].
      
      PMU type ID is retrieved from sysfs.
      
        root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
        8
      
        root@lkp-adl-d01:/sys/devices/cpu_core# cat type
        4
      
      When enabling a hybrid hardware event without specified pmu, such as,
      'perf stat -e cycles -a', two events are created automatically. One
      is for atom, the other is for core.
      
        # perf stat -e cycles -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        cycles: 0: 836272 1001525722 1001525722
        cycles: 1: 628564 1001580453 1001580453
        cycles: 2: 872693 1001605997 1001605997
        cycles: 3: 70417 1001641369 1001641369
        cycles: 4: 88593 1001726722 1001726722
        cycles: 5: 470495 1001752993 1001752993
        cycles: 6: 484733 1001840440 1001840440
        cycles: 7: 1272477 1001593105 1001593105
        cycles: 8: 209185 1001608616 1001608616
        cycles: 9: 204391 1001633962 1001633962
        cycles: 10: 264121 1001661745 1001661745
        cycles: 11: 826104 1001689904 1001689904
        cycles: 12: 89935 1001728861 1001728861
        cycles: 13: 70639 1001756757 1001756757
        cycles: 14: 185266 1001784810 1001784810
        cycles: 15: 171094 1001825466 1001825466
        cycles: 0: 129624 1001854843 1001854843
        cycles: 1: 122533 1001840421 1001840421
        cycles: 2: 90055 1001882506 1001882506
        cycles: 3: 139607 1001896463 1001896463
        cycles: 4: 141791 1001907838 1001907838
        cycles: 5: 530927 1001883880 1001883880
        cycles: 6: 143246 1001852529 1001852529
        cycles: 7: 667769 1001872626 1001872626
        cycles: 6744979 16026956922 16026956922
        cycles: 1965552 8014991106 8014991106
      
         Performance counter stats for 'system wide':
      
                 6,744,979      cpu_core/cycles/
                 1,965,552      cpu_atom/cycles/
      
               1.001882711 seconds time elapsed
      
      0x4 in 0x400000000 indicates the cpu_core pmu.
      0x8 in 0x800000000 indicates the cpu_atom pmu.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-9-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9cbfa2f6
    • Jin Yao's avatar
      perf stat: Uniquify hybrid event name · 12279429
      Jin Yao authored
      
      
      It would be useful to let user know the pmu which the event belongs to.
      perf-stat has supported '--no-merge' option and it can print the pmu
      name after the event name, such as:
      
      "cycles [cpu_core]"
      
      Now this option is enabled by default for hybrid platform but change
      the format to:
      
      "cpu_core/cycles/"
      
      If user configs the name, we still use the user specified name.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      ink: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      12279429
    • Jin Yao's avatar
      perf pmu: Add hybrid helper functions · c5a26ea4
      Jin Yao authored
      
      
      The functions perf_pmu__is_hybrid and perf_pmu__find_hybrid_pmu
      can be used to identify the hybrid platform and return the found
      hybrid cpu pmu. All the detected hybrid pmus have been saved in
      'perf_pmu__hybrid_pmus' list. So we just need to search this list.
      
      perf_pmu__hybrid_type_to_pmu converts the user specified string
      to hybrid pmu name. This is used to support the '--cputype' option
      in next patches.
      
      perf_pmu__has_hybrid checks the existing of hybrid pmu. Note that,
      we have to define it in pmu.c (make pmu-hybrid.c no more symbol
      dependency), otherwise perf test python would be failed.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-7-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c5a26ea4
    • Jin Yao's avatar
      perf pmu: Save detected hybrid pmus to a global pmu list · 44462430
      Jin Yao authored
      
      
      We identify the cpu_core pmu and cpu_atom pmu by explicitly
      checking following files:
      
      For cpu_core, checks:
      "/sys/bus/event_source/devices/cpu_core/cpus"
      
      For cpu_atom, checks:
      "/sys/bus/event_source/devices/cpu_atom/cpus"
      
      If the 'cpus' file exists and it has data, the pmu exists.
      
      But in order not to hardcode the "cpu_core" and "cpu_atom",
      and make the code in a generic way.
      
      So if the path "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the
      hybrid pmu exists. All the detected hybrid pmus are linked to a global
      list 'perf_pmu__hybrid_pmus' and then next we just need to iterate the
      list to get all hybrid pmu by using perf_pmu__for_each_hybrid_pmu.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-6-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      44462430
    • Jin Yao's avatar
      perf pmu: Save pmu name · 32705de7
      Jin Yao authored
      
      
      On hybrid platform, one event is available on one pmu
      (such as, available on cpu_core or on cpu_atom).
      
      This patch saves the pmu name to the pmu field of struct perf_pmu_alias.
      Then next we can know the pmu which the event can be enabled on.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-5-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      32705de7
    • Jin Yao's avatar
      perf pmu: Simplify arguments of __perf_pmu__new_alias · eab35953
      Jin Yao authored
      
      
      Simplify the arguments of __perf_pmu__new_alias() by passing the whole
      'struct pme_event' pointer.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-4-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      eab35953
    • Jin Yao's avatar
      perf jevents: Support unit value "cpu_core" and "cpu_atom" · 6b64833b
      Jin Yao authored
      
      
      For some Intel platforms, such as Alderlake, which is a hybrid platform
      and it consists of atom cpu and core cpu. Each cpu has dedicated event
      list. Part of events are available on core cpu, part of events are
      available on atom cpu.
      
      The kernel exports new cpu pmus: cpu_core and cpu_atom. The event in
      json is added with a new field "Unit" to indicate which pmu the event
      is available on.
      
      For example, one event in cache.json,
      
          {
              "BriefDescription": "Counts the number of load ops retired that",
              "CollectPEBSRecord": "2",
              "Counter": "0,1,2,3",
              "EventCode": "0xd2",
              "EventName": "MEM_LOAD_UOPS_RETIRED_MISC.MMIO",
              "PEBScounters": "0,1,2,3",
              "SampleAfterValue": "1000003",
              "UMask": "0x80",
              "Unit": "cpu_atom"
          },
      
      The unit "cpu_atom" indicates this event is only available on "cpu_atom".
      
      In generated pmu-events.c, we can see:
      
      {
              .name = "mem_load_uops_retired_misc.mmio",
              .event = "period=1000003,umask=0x80,event=0xd2",
              .desc = "Counts the number of load ops retired that. Unit: cpu_atom ",
              .topic = "cache",
              .pmu = "cpu_atom",
      },
      
      But if without this patch, the "uncore_" prefix is added before "cpu_atom",
      such as:
              .pmu = "uncore_cpu_atom"
      
      That would be a wrong pmu.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-3-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6b64833b
    • Jin Yao's avatar
      tools headers uapi: Update tools's copy of linux/perf_event.h · 41273611
      Jin Yao authored
      To get the changes in:
      
      Liang Kan's patch
      
        55bcf6ef
      
       ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
      
      Kan's patch is in the tip/perf/core branch.
      
      So the next perf tool patches need this interface for hybrid support.
      
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-2-yao.jin@linux.intel.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      41273611
    • Namhyung Kim's avatar
      perf report: Print percentage of each event statistics · 462f57db
      Namhyung Kim authored
      
      
      It's sometimes useful to see how many samples vs other events in the
      data file with percent values.
      
        $ perf report --stat
      
        Aggregated stats:
                   TOTAL events:      20064
                    MMAP events:        239  ( 1.2%)
                    COMM events:       1518  ( 7.6%)
                    EXIT events:          1  ( 0.0%)
                    FORK events:       1517  ( 7.6%)
                  SAMPLE events:       4015  (20.0%)
                   MMAP2 events:      12769  (63.6%)
          FINISHED_ROUND events:          2  ( 0.0%)
              THREAD_MAP events:          1  ( 0.0%)
                 CPU_MAP events:          1  ( 0.0%)
               TIME_CONV events:          1  ( 0.0%)
        cycles stats:
                  SAMPLE events:       2475
        instructions stats:
                  SAMPLE events:       1540
      
      Suggested-by: default avatarAndi Kleen <ak@linux.intel.com>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-7-namhyung@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      462f57db
    • Namhyung Kim's avatar
      perf report: Make --skip-empty as default · 8f08cf33
      Namhyung Kim authored
      
      
      so that the compact output is shown by default.  Also add 'report.skip-empty'
      config option to override the default.  Users can also use --no-skip-empty
      command line option to change the behavior anytime.
      
      Committer testing:
      
        $ perf report --stat
      
        Aggregated stats:
                   TOTAL events:         19
                    COMM events:          2
                    EXIT events:          1
                  SAMPLE events:          8
                   MMAP2 events:          4
          FINISHED_ROUND events:          1
              THREAD_MAP events:          1
                 CPU_MAP events:          1
               TIME_CONV events:          1
        cycles:u stats:
                  SAMPLE events:          8
        $ perf config report.skip-empty=false
        $ perf report --stat
      
        Aggregated stats:
                   TOTAL events:         19
                    MMAP events:          0
                    LOST events:          0
                    COMM events:          2
                    EXIT events:          1
                THROTTLE events:          0
              UNTHROTTLE events:          0
                    FORK events:          0
                    READ events:          0
                  SAMPLE events:          8
                   MMAP2 events:          4
                     AUX events:          0
            ITRACE_START events:          0
            LOST_SAMPLES events:          0
                  SWITCH events:          0
         SWITCH_CPU_WIDE events:          0
              NAMESPACES events:          0
                 KSYMBOL events:          0
               BPF_EVENT events:          0
                  CGROUP events:          0
               TEXT_POKE events:          0
                    ATTR events:          0
              EVENT_TYPE events:          0
            TRACING_DATA events:          0
                BUILD_ID events:          0
          FINISHED_ROUND events:          1
                ID_INDEX events:          0
           AUXTRACE_INFO events:          0
                AUXTRACE events:          0
          AUXTRACE_ERROR events:          0
              THREAD_MAP events:          1
                 CPU_MAP events:          1
             STAT_CONFIG events:          0
                    STAT events:          0
              STAT_ROUND events:          0
            EVENT_UPDATE events:          0
               TIME_CONV events:          1
                 FEATURE events:          0
              COMPRESSED events:          0
        cycles:u stats:
                  SAMPLE events:          8
        $ perf config report.skip-empty
        report.skip-empty=false
        $
      
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-6-namhyung@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      8f08cf33
    • Namhyung Kim's avatar
      perf report: Add --skip-empty option to suppress 0 event stat · 2775de0b
      Namhyung Kim authored
      
      
      To make the output more readable, I think it's better to remove 0's in
      the output.  Also the dummy event has no event stats so it just wasts
      the space.  Let's use the --skip-empty option to suppress it.
      
        $ perf report --stat --skip-empty
      
        Aggregated stats:
                   TOTAL events:      16530
                    MMAP events:        226
                    COMM events:       1596
                    EXIT events:          2
                THROTTLE events:        121
              UNTHROTTLE events:        117
                    FORK events:       1595
                  SAMPLE events:        719
                   MMAP2 events:      12147
                  CGROUP events:          2
          FINISHED_ROUND events:          2
              THREAD_MAP events:          1
                 CPU_MAP events:          1
               TIME_CONV events:          1
        cycles stats:
                  SAMPLE events:        719
      
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2775de0b
    • Namhyung Kim's avatar
      perf report: Show event sample counts in --stat output · 55f75444
      Namhyung Kim authored
      
      
      To make the output identical with perf report -D, it needs to show
      per-event sample counts along with the aggregated stat  at the end.
      
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-4-namhyung@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      55f75444
    • Namhyung Kim's avatar
      perf hists: Split hists_stats from events_stats · 0f0abbac
      Namhyung Kim authored
      
      
      Each struct hists have events_stats but most of the fields were not
      used.  It's to count number of samples and periods whether filtered or
      not.  And other fields are used only by evlist.
      
      So it'd be better to split hists_stats and events_stats to reduce
      wasted memory in the struct hists.  This makes the output of event
      statistics in the perf report compact by skipping 0 events in each
      evsel/hists.
      
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-3-namhyung@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0f0abbac
    • Namhyung Kim's avatar
      perf top: Use evlist->events_stat to count events · bf8f8587
      Namhyung Kim authored
      
      
      It's mainly to count lost events for the warning so it should be ok
      to use the evlist->stats instead.  This is needed for changes in the
      next commit.
      
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427013717.1651674-2-namhyung@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      bf8f8587
    • Nicholas Fraser's avatar
      perf data: Add JSON export · d0713d4c
      Nicholas Fraser authored
      
      
      This adds a feature to export perf data to JSON.
      
      The resolved symbols are exported into the JSON so that external tools
      don't need to load the dsos themselves (or even have access to them at
      all.) This makes it easy to load and analyze perf data with standalone
      tools where direct perf or libbabeltrace integration is impractical.
      
      The exporter uses a minimal inline JSON encoding without any external
      dependencies. Currently it only outputs some headers and sample metadata
      but it's easily extensible.
      
      Use it like this:
      
        $ perf data convert --to-json out.json
      
      Committer notes:
      
      Fixup a __printf() bug that broke the build:
      
        util/data-convert-json.c:103:11: error: expected ‘)’ before numeric constant
          103 | __(printf, 5, 6)
              |           ^~
              |           )
        util/data-convert-json.c: In function ‘output_sample_callchain_entry’:
        util/data-convert-json.c:124:2: error: implicit declaration of function ‘output_json_key_format’; did you mean ‘output_json_format’? [-Werror=implicit-function-declaration]
          124 |  output_json_key_format(out, false, 5, "ip", "\"0x%" PRIx64 "\"", ip);
              |  ^~~~~~~~~~~~~~~~~~~~~~
              |  output_json_format
      
      Also had to add this patch to fix errors reported by various versions of
      clang:
      
        -       if (al && al->sym && al->sym->name && strlen(al->sym->name) > 0) {
        +       if (al && al->sym && al->sym->namelen) {
      
      al->sym->name is a zero sized array, to avoid one extra alloc in the
      symbol__new() constructor, sym->namelen carries its strlen.
      
      Committer testing:
      
        $ ls -la out.json
        ls: cannot access 'out.json': No such file or directory
        $ perf record sleep 0.1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
        $ perf report --stats | grep -w SAMPLE
                  SAMPLE events:          8
        $ perf data convert --to-json out.json
        [ perf data convert: Converted 'perf.data' into JSON data 'out.json' ]
        [ perf data convert: Converted and wrote 0.002 MB (8 samples) ]
        $ ls -la out.json
        -rw-rw-r--. 1 acme acme 2017 Apr 26 17:29 out.json
        $ cat out.json
        {
        	"linux-perf-json-version": 1,
        	"headers": {
        		"header-version": 1,
        		"captured-on": "2021-04-26T20:28:57Z",
        		"data-offset": 432,
        		"data-size": 1016,
        		"feat-offset": 1448,
        		"hostname": "five",
        		"os-release": "5.11.14-200.fc33.x86_64",
        		"arch": "x86_64",
        		"cpu-desc": "AMD Ryzen 9 3900X 12-Core Processor",
        		"cpuid": "AuthenticAMD,23,113,0",
        		"nrcpus-online": 24,
        		"nrcpus-avail": 24,
        		"perf-version": "5.12.gee134f3189bd",
        		"cmdline": [
        			"/home/acme/bin/perf",
        			"record",
        			"sleep",
        			"0.1"
        		]
        	},
        	"samples": [
        		{
        			"timestamp": 170517539043684,
        			"pid": 375844,
        			"tid": 375844,
        			"comm": "sleep",
        			"callchain": [
        				{
        					"ip": "0xffffffffa6268827"
        				}
        			]
        		},
        		{
        			"timestamp": 170517539048443,
        			"pid": 375844,
        			"tid": 375844,
        			"comm": "sleep",
        			"callchain": [
        				{
        					"ip": "0xffffffffa661359d"
        				}
        			]
        		},
        		{
        			"timestamp": 170517539051018,
        			"pid": 375844,
        			"tid": 375844,
        			"comm": "sleep",
        			"callchain": [
        				{
        					"ip": "0xffffffffa6311e18"
        				}
        			]
        		},
        		{
        			"timestamp": 170517539053652,
        			"pid": 375844,
        			"tid": 375844,
        			"comm": "sleep",
        			"callchain": [
        				{
        					"ip": "0x7fdb77b4812b",
        					"symbol": "_dl_start",
        					"dso": "ld-2.32.so"
        				}
        			]
        		},
        		{
        			"timestamp": 170517539055306,
        			"pid": 375844,
        			"tid": 375844,
        			"comm": "sleep",
        			"callchain": [
        				{
        					"ip": "0xffffffffa6269286"
        				}
        			]
        		},
        		{
        			"timestamp": 170517539057590,
        			"pid": 375844,
        			"tid": 375844,
        			"comm": "sleep",
        			"callchain": [
        				{
        					"ip": "0xffffffffa62abd8b"
        				}
        			]
        		},
        		{
        			"timestamp": 170517539067559,
        			"pid": 375844,
        			"tid": 375844,
        			"comm": "sleep",
        			"callchain": [
        				{
        					"ip": "0x7fdb77b5e9e9",
        					"symbol": "__GI___tunables_init",
        					"dso": "ld-2.32.so"
        				}
        			]
        		},
        		{
        			"timestamp": 170517539282452,
        			"pid": 375844,
        			"tid": 375844,
        			"comm": "sleep",
        			"callchain": [
        				{
        					"ip": "0x7fdb779978d2",
        					"symbol": "getenv",
        					"dso": "libc-2.32.so"
        				}
        			]
        		}
        	]
        }
        $
      
      Signed-off-by: default avatarNicholas Fraser <nfraser@codeweavers.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tan Xiaojun <tanxiaojun@huawei.com>
      Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
      Link: http://lore.kernel.org/lkml/3884969f-804d-2f53-c648-e2b0bd85edff@codeweavers.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d0713d4c
    • Song Liu's avatar
      perf stat: Introduce bpf_counter_ops->disable() · 5508c9da
      Song Liu authored
      
      
      Introduce bpf_counter_ops->disable(), which is used stop counting the
      event.
      
      Committer notes:
      
      Added a dummy bpf_counter__disable() to the python binding to avoid
      having 'perf test python' failing.
      
      bpf_counter isn't supported in the python binding.
      
      Signed-off-by: default avatarSong Liu <song@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: kernel-team@fb.com
      Link: https://lore.kernel.org/r/20210425214333.1090950-6-song@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5508c9da
    • Song Liu's avatar
      perf stat: Introduce ':b' modifier · 01bd8efc
      Song Liu authored
      
      
      Introduce 'b' modifier to event parser, which means use BPF program to
      manage this event. This is the same as --bpf-counters option, but only
      applies to this event. For example,
      
        perf stat -e cycles:b,cs               # use bpf for cycles, but not cs
        perf stat -e cycles,cs --bpf-counters  # use bpf for both cycles and cs
      
      Suggested-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarSong Liu <song@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/r/20210425214333.1090950-5-song@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      01bd8efc
    • Song Liu's avatar
      perf stat: Introduce config stat.bpf-counter-events · 112cb561
      Song Liu authored
      
      
      Currently, to use BPF to aggregate perf event counters, the user uses
      --bpf-counters option. Enable "use bpf by default" events with a config
      option, stat.bpf-counter-events. Events with name in the option will use
      BPF.
      
      This also enables mixed BPF event and regular event in the same sesssion.
      For example:
      
         perf config stat.bpf-counter-events=instructions
         perf stat -e instructions,cs
      
      The second command will use BPF for "instructions" but not "cs".
      
      Signed-off-by: default avatarSong Liu <song@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/r/20210425214333.1090950-4-song@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      112cb561
    • Song Liu's avatar
      perf bpf: check perf_attr_map is compatible with the perf binary · fe3dd826
      Song Liu authored
      
      
      perf_attr_map could be shared among different version of perf binary. Add
      bperf_attr_map_compatible() to check whether the existing attr_map is
      compatible with current perf binary.
      
      Signed-off-by: default avatarSong Liu <song@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: kernel-team@fb.com
      Link: https://lore.kernel.org/r/20210425214333.1090950-3-song@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      fe3dd826
    • Song Liu's avatar
      perf util: Move bpf_perf definitions to a libperf header · ec8149fb
      Song Liu authored
      
      
      By following the same protocol, other tools can share hardware PMCs with
      perf. Move perf_event_attr_map_entry and BPF_PERF_DEFAULT_ATTR_MAP_PATH to
      bpf_perf.h for other tools to use.
      
      Signed-off-by: default avatarSong Liu <song@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: kernel-team@fb.com
      Link: https://lore.kernel.org/r/20210425214333.1090950-2-song@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ec8149fb
  2. Apr 26, 2021
  3. Apr 25, 2021
  4. Apr 24, 2021