  1. Oct 06, 2015
    • perf tools: Introduce 'P' modifier to request max precision · 7f94af7a
      Jiri Olsa authored
      
      
      The 'P' modifier causes the event to use the maximum precise level
      that can be detected.

      The following record command:
        $ perf record -e cycles:P ...

      will detect the maximum precise level for the 'cycles' event and use it.

      Committer note:
      
      Testing it:
      
        $ perf record -e cycles:P usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.013 MB perf.data (9 samples) ]
        $ perf evlist
        cycles:P
        $ perf evlist -v
        cycles:P: size: 112, { sample_period, sample_freq }: 4000, sample_type:
        IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
        enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1, mmap2: 1,
        comm_exec: 1
        $
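
The detection described above can be pictured as a downward probe: start at the highest precise level and step down until the event can be opened. The following is a minimal Python sketch of that idea only. `try_open` is a hypothetical callback standing in for an attempt to open the event with a given precise_ip value, and the highest level of 3 is an assumption taken from the "ppp" modifier; perf's own C implementation may differ in detail.

```python
def detect_max_precise(try_open, highest=3):
    """Return the highest precise level that try_open accepts.

    try_open is a hypothetical stand-in for attempting to open the
    event with the given precise_ip value; it returns True on success.
    """
    for level in range(highest, 0, -1):
        if try_open(level):
            return level
    # No precise level worked: fall back to a non-precise event.
    return 0


# Pretend hardware that supports precise levels 1 and 2 only.
supported = {1, 2}
print(detect_max_precise(lambda level: level in supported))
```

With the pretend hardware above the probe settles on level 2, the same value that appears as precise_ip in the 'perf evlist -v' output of the test run.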
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444068369-20978-6-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Export perf_event_attr__set_max_precise_ip() · 45cf6c33
      Jiri Olsa authored
      
      
      It'll be used in the following patch.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444068369-20978-5-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Fix sizeof_sym_hist overflow issue · 5ec4502d
      Jiri Olsa authored
      
      
      The annotated_source::sizeof_sym_hist could easily overflow the int size,
      resulting in a crash in __symbol__inc_addr_samples.

      Change its type to size_t, as was probably intended from the beginning
      based on the initialization code in symbol__alloc_hist.
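
To illustrate why the narrower type is a problem, here is a small Python sketch that uses ctypes to mimic storing a byte count in a 32-bit signed int versus a 64-bit unsigned type. The 3 GB figure is a made-up value chosen only to exceed the int range, and the helper names are hypothetical; this is not the perf code itself.

```python
import ctypes


def stored_as_int(nbytes):
    """Value that results from squeezing nbytes into a 32-bit signed int."""
    return ctypes.c_int32(nbytes).value


def stored_as_size_t64(nbytes):
    """Value kept by a 64-bit unsigned type, as size_t is on 64-bit builds."""
    return ctypes.c_uint64(nbytes).value


# Hypothetical histogram size of roughly 3 GB.
big = 3_000_000_000
print(stored_as_int(big))       # wraps around to a negative number
print(stored_as_size_t64(big))  # preserved unchanged
```

A negative or otherwise wrapped size used in later offset arithmetic is the kind of value that can lead to an out-of-bounds access.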
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444068369-20978-4-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Display DATA_SRC sample type bit · 84422592
      Jiri Olsa authored
      
      
      Add a DATA_SRC bit_name call so that sample_type is displayed properly.
      
         $ perf evlist -v
         cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|DATA_SRC, ...
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444068369-20978-3-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools lib api fs: No need to use PATH_MAX + 1 · ccb5597f
      Jiri Olsa authored
      
      
      Because there's no point, PATH_MAX is big enough.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444068369-20978-2-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. Oct 03, 2015
    • Merge tag 'perf-core-for-mingo' of... · e3b0ac1b
      Ingo Molnar authored
      
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
       - Do event name substring search as last resort in 'perf list'.
         (Arnaldo Carvalho de Melo)
      
         E.g.:
      
          # perf list clock
      
          List of pre-defined events (to be used in -e):
      
           cpu-clock                                          [Software event]
           task-clock                                         [Software event]
      
           uncore_cbox_0/clockticks/                          [Kernel PMU event]
           uncore_cbox_1/clockticks/                          [Kernel PMU event]
      
           kvm:kvm_pvclock_update                             [Tracepoint event]
           kvm:kvm_update_master_clock                        [Tracepoint event]
           power:clock_disable                                [Tracepoint event]
           power:clock_enable                                 [Tracepoint event]
           power:clock_set_rate                               [Tracepoint event]
           syscalls:sys_enter_clock_adjtime                   [Tracepoint event]
           syscalls:sys_enter_clock_getres                    [Tracepoint event]
           syscalls:sys_enter_clock_gettime                   [Tracepoint event]
           syscalls:sys_enter_clock_nanosleep                 [Tracepoint event]
           syscalls:sys_enter_clock_settime                   [Tracepoint event]
           syscalls:sys_exit_clock_adjtime                    [Tracepoint event]
           syscalls:sys_exit_clock_getres                     [Tracepoint event]
           syscalls:sys_exit_clock_gettime                    [Tracepoint event]
           syscalls:sys_exit_clock_nanosleep                  [Tracepoint event]
           syscalls:sys_exit_clock_settime                    [Tracepoint event]
      
       - Reduce min 'perf stat --interval-print/-I' to 10ms. (Kan Liang)
      
         perf stat --interval in action:
      
         # perf stat -e cycles -I 50 -a usleep $((200 * 1000))
         print interval < 100ms. The overhead percentage could be high in some cases. Please proceed with caution.
         #   time                    counts unit events
            0.050233636         48,240,396      cycles
            0.100557098         35,492,594      cycles
            0.150804687         39,295,112      cycles
            0.201032269         33,101,961      cycles
            0.201980732            786,379      cycles
        #
      
       - Allow for max_stack greater than PERF_MAX_STACK_DEPTH, as when
         synthesizing callchains from Intel PT data. (Adrian Hunter)
      
       - Allow probing on kmodules without DWARF. (Masami Hiramatsu)
      
       - Fix a segfault when processing a perf.data file with callchains using
         "perf report --call-graph none". (Namhyung Kim)
      
       - Fix unresolved COMMs in 'perf top' when -s comm is used. (Namhyung Kim)
      
       - Register idle thread in 'perf top'. (Namhyung Kim)
      
       - Change 'record.samples' type to unsigned long long, fixing output of
         number of samples in 32-bit architectures. (Yang Shi)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf stat: Reduce min --interval-print to 10ms · 19afd104
      Kan Liang authored
      
      
      The --interval-print parameter was limited to 100ms. However, for
      example, 10ms is required to do sophisticated bandwidth analysis using
      uncore events.
      
      The test shows that the overhead of the system-wide uncore monitoring
      with 10ms interval is only ~2%. So this patch reduces the minimal
      interval-print allowed to 10ms.

      But 10ms may not work well for all cases. For example, when the
      number of cpus/threads is very large, the overhead of system-wide core
      event monitoring could be high.

      To handle this issue, a warning is displayed when the
      interval-print is set between 10ms and 100ms, so users can make a
      decision according to their specific cases.
      
       # perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 10 -- sleep 1
      
       print interval < 100ms. The overhead percentage could be high in some
       cases. Please proceed with caution.
       #           time             counts unit events
            0.010200451               0.10 MiB  uncore_imc_1/cas_count_read/
            0.020475117               0.02 MiB  uncore_imc_1/cas_count_read/
            0.030692800               0.01 MiB  uncore_imc_1/cas_count_read/
            0.040948161               0.02 MiB  uncore_imc_1/cas_count_read/
            0.051159564               0.00 MiB  uncore_imc_1/cas_count_read/
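
The policy described in this message (reject intervals below 10ms, accept but warn below 100ms) can be sketched as a small check. The Python below is an illustration only: the function and constant names are hypothetical, only positive requested intervals are considered, and the warning text is copied from the output shown above.

```python
MIN_INTERVAL_MS = 10   # new lower bound described in the commit
WARN_BELOW_MS = 100    # previous lower bound; now only triggers a warning

OVERHEAD_WARNING = (
    "print interval < 100ms. The overhead percentage could be high in "
    "some cases. Please proceed with caution."
)


def check_interval(ms):
    """Return (accepted, warning) for a requested print interval in ms."""
    if ms < MIN_INTERVAL_MS:
        return False, None
    if ms < WARN_BELOW_MS:
        return True, OVERHEAD_WARNING
    return True, None


for requested in (5, 10, 50, 100):
    print(requested, check_interval(requested))
```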
      
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1443776674-42511-1-git-send-email-kan.liang@intel.com
      [ Added warning about overhead when using sub 100ms intervals to the man page ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Change 'record.samples' type to unsigned long long · 9f065194
      Yang Shi authored
      
      
      When running "perf record -e", the number of samples shown is wrong on some
      32-bit systems, e.g. powerpc and arm.
      
      For example, run the below commands on 32 bit powerpc:
      
        perf probe -x /lib/libc.so.6 malloc
        perf record -e probe_libc:malloc -a ls perf.data
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.036 MB perf.data (13829241621624967218 samples) ]
      
      Actually, "perf script" shows just 21 samples. The reported number of samples
      is absurd because 'samples' is of type long but is printed as PRIu64.
      
      Build test ran on x86-64, x86, aarch64, arm, mips, ppc and ppc64.
      
      Signed-off-by: Yang Shi <yang.shi@linaro.org>
      Cc: linaro-kernel@lists.linaro.org
      Link: http://lkml.kernel.org/r/1443563383-4064-1-git-send-email-yang.shi@linaro.org
      [ Bumped the 'hits' var used together with record.samples to 'unsigned long long' too ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Allow probing on kmodules without dwarf · 1a8ac29c
      Masami Hiramatsu authored
      
      
      Allow probing on kernel modules when 'perf' is built without debuginfo
      support.
      
      Currently perf-probe --module requires linking with libdw, but this
      doesn't make sense.
      
      E.g.
        ----
        # make NO_DWARF=1
        # ./perf probe -m pcspkr pcspkr_event%return
          Error: unknown switch `m'
        ----
      
      With this patch
        ----
        # ./perf probe -m pcspkr pcspkr_event%return
        Added new event:
          probe:pcspkr_event   (on pcspkr_event%return in pcspkr)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:pcspkr_event -aR sleep 1
        ----
      
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20151002125832.18617.78721.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Honour 'event_glob' when printing selectable PMUs · fa52ceab
      Arnaldo Carvalho de Melo authored
      
      
      Some PMUs, like the 'intel_bts' one can be used as an event name, i.e.:
      
      	$ perf record -e intel_bts:// usleep 1
      
      Is a valid event name.
      
      But the code printing such PMUs was not honouring the 'event_glob'
      parameter, so the following line was always appearing:
      
        $ intel_bts//                                        [Kernel PMU event]
      
      Fix it:
      
        $ [acme@felicio linux]$ perf list data
      
        List of pre-defined events (to be used in -e):
      
          uncore_imc/data_reads/                             [Kernel PMU event]
          uncore_imc/data_writes/                            [Kernel PMU event]
      
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ajb71858n7q7ao77b8pyy74w@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. Oct 01, 2015
    • perf list: Do event name substring search as last resort when no events found · dbc67409
      Arnaldo Carvalho de Melo authored
      
      
      Before:
      
        # perf list _alloc_ | head -10
        #
      
      After:
      
        # perf list _alloc_ | head -10
          ext4:ext4_alloc_da_blocks                          [Tracepoint event]
          ext4:ext4_get_implied_cluster_alloc_exit           [Tracepoint event]
          kmem:kmem_cache_alloc_node                         [Tracepoint event]
          kmem:mm_page_alloc_extfrag                         [Tracepoint event]
          kmem:mm_page_alloc_zone_locked                     [Tracepoint event]
          xen:xen_mmu_alloc_ptpage                           [Tracepoint event]
        #
      
      And it works for all types of events:
      
        # perf list br
      
        List of pre-defined events (to be used in -e):
      
          branch-instructions OR branches                    [Hardware event]
          branch-misses                                      [Hardware event]
      
          branch-load-misses                                 [Hardware cache event]
          branch-loads                                       [Hardware cache event]
      
          branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
          branch-misses OR cpu/branch-misses/                [Kernel PMU event]
      
          filelock:break_lease_block                         [Tracepoint event]
          filelock:break_lease_noblock                       [Tracepoint event]
          filelock:break_lease_unblock                       [Tracepoint event]
          syscalls:sys_enter_brk                             [Tracepoint event]
          syscalls:sys_exit_brk                              [Tracepoint event]
      
        #
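
The "last resort" behaviour shown above can be sketched as a two-pass match: try the argument as a glob first, and only if nothing matches retry it as a substring. The Python below is a hedged illustration using the standard fnmatch module. The function name and the event list are made up for the example, and wrapping the argument in `*...*` is an assumption about how a substring search could be expressed, not a statement about perf's code.

```python
from fnmatch import fnmatchcase


def list_events(pattern, events):
    """Return events matching pattern, falling back to a substring search."""
    exact = [name for name in events if fnmatchcase(name, pattern)]
    if exact:
        return exact
    # Last resort: treat the argument as a substring of the event name.
    return [name for name in events if fnmatchcase(name, f"*{pattern}*")]


# Hypothetical event list for demonstration.
events = [
    "cpu-clock",
    "ext4:ext4_alloc_da_blocks",
    "kmem:kmem_cache_alloc_node",
]
print(list_events("_alloc_", events))
print(list_events("kmem:*", events))
```

Because the substring pass only runs when the first pass finds nothing, an argument that already matches as a glob keeps its previous behaviour.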
      
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-qieivl18jdemoaghgndj36e6@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf callchain: Allow for max_stack greater than PERF_MAX_STACK_DEPTH · 0edd4533
      Adrian Hunter authored
      
      
      Adjust the validation to allow for max_stack greater than
      PERF_MAX_STACK_DEPTH.
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1443186956-18718-18-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix a bug on "--call-graph none" option · 208e7607
      Namhyung Kim authored
      The patch f9db0d0f ("perf callchain: Allow disabling call graphs per
      event") added an ability to enable/disable callchain recording per
      event.  But it had a problem when the enablement setting is changed at
      'perf report' time using the -g/--call-graph option.
      
      For example, the following scenario will get a segfault.
      
        $ perf record -ag sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.500 MB perf.data (2555 samples) ]
      
        $ perf report -g none
        perf: Segmentation fault
        -------- backtrace --------
        perf[0x53a98a]
        /usr/lib/libc.so.6(+0x335af)[0x7f4e91df95af]
      
      This is because the callchain_param.sort() callback was not set, but perf
      still tried to call it since the sample type had the PERF_SAMPLE_CALLCHAIN bit.
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Fixes: f9db0d0f ("perf callchain: Allow disabling call graphs per event")
      Link: http://lkml.kernel.org/r/1443587640-24242-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf top: Register idle thread · c53d138d
      Namhyung Kim authored
      
      
      perf top didn't add the idle/swapper thread to the machine's thread
      list, so its comm was displayed as ':0'.  Fix it.
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1443577526-3240-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf top: Fix unresolved comm when -s comm is used · 4b37af59
      Namhyung Kim authored
      
      
      perf top uses the 'dso,symbol' sort keys by default, so it overlooked a
      problem in resolving a task's comm.  When the sort key contains 'comm',
      some tasks' comms are not shown properly.  This is because
      perf_top__mmap_read_idx() checks the cpumode value improperly.

      The cpumode value of non-sample events is 0
      (PERF_RECORD_MISC_CPUMODE_UNKNOWN), so those events were ignored by the
      switch statement.  This patch allows that value for non-sample events.
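
A rough picture of the corrected check, written as a Python sketch rather than the actual C switch statement: an unknown cpumode is acceptable for non-sample events such as COMM, and only samples without a usable cpumode are dropped. The numeric constants follow the perf_event UAPI header; the function name is hypothetical and the guest and hypervisor modes handled by the real code are left out.

```python
# Values as defined in the perf_event UAPI header.
PERF_RECORD_MISC_CPUMODE_MASK = 7
PERF_RECORD_MISC_CPUMODE_UNKNOWN = 0
PERF_RECORD_MISC_KERNEL = 1
PERF_RECORD_MISC_USER = 2
PERF_RECORD_COMM = 3
PERF_RECORD_SAMPLE = 9


def should_process(event_type, misc):
    """Decide whether an event read from the ring buffer is processed."""
    cpumode = misc & PERF_RECORD_MISC_CPUMODE_MASK
    if cpumode == PERF_RECORD_MISC_CPUMODE_UNKNOWN:
        # Non-sample events carry no cpumode, so keep them; a sample
        # with an unknown cpumode cannot be attributed and is skipped.
        return event_type != PERF_RECORD_SAMPLE
    return True


print(should_process(PERF_RECORD_COMM, PERF_RECORD_MISC_CPUMODE_UNKNOWN))
print(should_process(PERF_RECORD_SAMPLE, PERF_RECORD_MISC_CPUMODE_UNKNOWN))
```

Keeping COMM events is what lets the tool learn task names, which is why the bug only showed up once 'comm' was part of the sort key.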
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1443577526-3240-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Allocate area for sample_id_hdr in a synthesized comm event · e5bed564
      Namhyung Kim authored
      
      
      A previous patch added a synthesized comm event for a forked child process
      but it missed that the event should contain an area for sample_id_hdr at
      the end.  It worked by accident since the perf_event union contains
      bigger event structs like mmap_events.

      This patch fixes it by dynamically allocating an event struct that
      includes that area, as in perf_event__synthesize_thread_map().
      
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1443577526-3240-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf/x86/intel/uncore: Do not use macro DEFINE_PCI_DEVICE_TABLE() · c2365b93
      Vaishali Thakkar authored
      
      
      The DEFINE_PCI_DEVICE_TABLE() macro is deprecated. Use
      'struct pci_device_id' instead of DEFINE_PCI_DEVICE_TABLE(),
      with the goal of getting rid of this macro completely.
      
      This Coccinelle semantic patch performs this transformation:
      
      	@@
      	identifier a;
      	declarer name DEFINE_PCI_DEVICE_TABLE;
      	initializer i;
      	@@
      	- DEFINE_PCI_DEVICE_TABLE(a)
      	+ const struct pci_device_id a[] = i;
      
      Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/20151001085201.GA16939@localhost
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'perf-core-for-mingo' of... · 4bc6a58f
      Ingo Molnar authored
      
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes:
      
      User visible changes:
      
        - By default use the most precise "cycles" hw counter available, i.e.
          when the user doesn't specify any event, it will try using cycles:ppp,
          cycles:pp, etc. (Arnaldo Carvalho de Melo)
      
        - Remove blank lines, headers when piping output in 'perf list', so that it can
          be sanely used with 'wc -l', etc. (Arnaldo Carvalho de Melo)
      
        - Amend documentation about max_stack and synthesized callchains. (Adrian Hunter)
      
        - Fix 'perf probe -l' for probes added to kernel module functions. (Masami Hiramatsu)
      
      Build fixes:
      
        - Fix shadowed declarations that break the build on older distros. (Jiri Olsa)
      
        - Fix build break on powerpc due to sample_reg_masks. (Sukadev Bhattiprolu)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: By default use the most precise "cycles" hw counter available · 7f8d1ade
      Arnaldo Carvalho de Melo authored
      
      
      If the user doesn't specify any event, try the most precise "cycles"
      available, i.e. start with "cycles:ppp" and keep removing "p" until it
      works.
      
      E.g.
      
        $ perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.017 MB perf.data (11 samples) ]
        $ perf evlist
        cycles:pp
        $ perf evlist -v
        cycles:pp: size: 112, { sample_period, sample_freq }: 4000, sample_type:
        IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
        enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1,
        exclude_guest: 1, mmap2: 1, comm_exec: 1
        $ grep 'model name' /proc/cpuinfo | head -1
        model name	: Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
        $
      
      When 'cycles' is explicitly specified this will not be tried,
      i.e. the user has full control of the level of precision to be used:
      
        $ perf record -e cycles usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.016 MB perf.data (9 samples) ]
        $ perf evlist
        cycles
        $ perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
        IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
        enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2:
        1, comm_exec: 1
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://www.youtube.com/watch?v=nXaxk27zwlk
      Link: http://lkml.kernel.org/n/tip-b1ywebmt22pi78vjxau01wth@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Remove blank lines, headers when piping output · dfc431cb
      Arnaldo Carvalho de Melo authored
      
      
      So that one can, for instance, use it with wc -l:
      
        # perf list *:*write* | wc -l
        60
      
      Or to look for the "bio" tracepoints, without 'perf list' headers:
      
        # perf list *:*bio* | head
          block:block_bio_backmerge                          [Tracepoint event]
          block:block_bio_bounce                             [Tracepoint event]
          block:block_bio_complete                           [Tracepoint event]
          block:block_bio_frontmerge                         [Tracepoint event]
          block:block_bio_queue                              [Tracepoint event]
          block:block_bio_remap                              [Tracepoint event]
        #
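
One common way to get this kind of behaviour is to emit decorative headers and blank lines only when the output goes to an interactive terminal. The Python sketch below shows that general pattern. It is not taken from the perf sources, which may decide differently (for example based on whether a pager is in use), and the function name is hypothetical.

```python
import sys


def render_listing(events, interactive):
    """Format (name, kind) pairs, adding headers only for interactive use."""
    lines = []
    if interactive:
        lines += ["", "List of pre-defined events (to be used in -e):", ""]
    lines += [f"  {name:<50} [{kind}]" for name, kind in events]
    return lines


demo = [("block:block_bio_queue", "Tracepoint event")]
# When piped (e.g. into 'wc -l'), stdout is not a terminal and only the
# event lines are printed.
print("\n".join(render_listing(demo, sys.stdout.isatty())))
```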
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ts7sc0x8u4io4cifzkup4j44@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Improve error message when %return is on inlined function · 6cca13bd
      Masami Hiramatsu authored
      
      
      perf probe shows a more precise message when it finds that the given
      %return target function is inlined.
      
      Without this fix:
        ----
        # ./perf probe -V getname_flags%return
        Return probe must be on the head of a real function.
        Debuginfo analysis failed.
          Error: Failed to show vars.
        ----
      
      With this fix:
        ----
        # ./perf probe -V getname_flags%return
        Failed to find "getname_flags%return",
         because getname_flags is an inlined function and has no return point.
        Debuginfo analysis failed.
          Error: Failed to show vars.
        ----
      
      Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20150930164137.3733.55055.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Fix a segfault bug in debuginfo_cache · 20f49859
      Masami Hiramatsu authored
      
      
      perf probe --list will get a segfault if the first kprobe event is on a
      module and the second or a later one is on the kernel.
      
      e.g.
        ----
        # ./perf probe -q -m pcspkr pcspkr_event
        # ./perf probe -q vfs_read
        # ./perf probe -l
        Segmentation fault (core dumped)
        ----
      
      This is because the debuginfo_cache fails to handle a NULL module name,
      which causes a segfault in strcmp. (Note that strcmp("something", NULL)
      is undefined behaviour and typically segfaults.)

      To fix this, debuginfo_cache__open always translates the NULL module name
      to "kernel" (this is correct, because a NULL module name means opening the
      debuginfo for the kernel).
      
        ----
        # ./perf probe -l
          probe:pcspkr_event   (on pcspkr_event@drivers/input/misc/pcspkr.c
          in pcspkr)
          probe:vfs_read       (on vfs_read@ksrc/linux-3/fs/read_write.c)
        ----
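
The fix amounts to normalising the missing module name before it is compared with the cached one. A minimal Python sketch of a one-entry cache with that normalisation follows. The class and its `opener` callback are hypothetical stand-ins for the C code, and since comparing against None does not crash in Python, the sketch only illustrates the normalisation step, not the segfault.

```python
KERNEL_NAME = "kernel"


class DebuginfoCache:
    """One-entry cache keyed by module name (None means the kernel)."""

    def __init__(self, opener):
        self._opener = opener  # hypothetical: opens debuginfo by name
        self._name = None
        self._handle = None

    def open(self, module):
        # Translate the "no module" case to a real string up front so
        # that every later comparison is string against string.
        name = module if module is not None else KERNEL_NAME
        if self._handle is not None and self._name == name:
            return self._handle
        self._handle = self._opener(name)
        self._name = name
        return self._handle


opened = []
cache = DebuginfoCache(lambda name: opened.append(name) or name.upper())
cache.open("pcspkr")
cache.open(None)
cache.open(None)   # served from the cache
print(opened)
```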
      
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20150930164135.3733.23993.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Show correct source lines of probes on kmodules · 9b239a12
      Masami Hiramatsu authored
      
      
      perf probe always failed to find the appropriate line numbers because it
      failed to find the .text start address offset from the debuginfo.
      
      e.g.
        ----
        # ./perf probe -m pcspkr pcspkr_event:5
        Added new events:
          probe:pcspkr_event   (on pcspkr_event:5 in pcspkr)
          probe:pcspkr_event_1 (on pcspkr_event:5 in pcspkr)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:pcspkr_event_1 -aR sleep 1
      
        # ./perf probe -l
        Failed to find debug information for address ffffffffa031f006
        Failed to find debug information for address ffffffffa031f016
          probe:pcspkr_event   (on pcspkr_event+6 in pcspkr)
          probe:pcspkr_event_1 (on pcspkr_event+22 in pcspkr)
        ----
      
      This fixes the above issue as follows:
      1. Get the relative address of the symbol in .text by using
         map->start.
      2. Adjust the address by adding the offset of .text section
         in the kernel module binary.
      
      With this fix, perf probe -l shows lines correctly.
        ----
        # ./perf probe -l
          probe:pcspkr_event   (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr)
          probe:pcspkr_event_1 (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr)
        ----
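
The two-step address adjustment listed in this message is simple arithmetic, sketched here in Python. The probe address is taken from the error output above; the map start and the .text section offset are made-up values chosen only to make the example concrete, and the function name is hypothetical.

```python
def debuginfo_address(addr, map_start, text_offset):
    """Translate a runtime module address into a debuginfo address."""
    relative = addr - map_start       # step 1: offset of the symbol in .text
    return relative + text_offset     # step 2: add .text offset in the module binary


probe_addr = 0xFFFFFFFFA031F006       # from the error message above
assumed_map_start = 0xFFFFFFFFA031F000   # hypothetical value of map->start
assumed_text_offset = 0x40               # hypothetical .text offset in the .ko

print(hex(debuginfo_address(probe_addr, assumed_map_start, assumed_text_offset)))
```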
      
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20150930164132.3733.24643.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Begin and end libdwfl report session correctly · 9135949d
      Masami Hiramatsu authored
      
      
      Fix a trivial bug in the libdwfl usage of the report session: a report
      session should be explicitly begun and ended around dwfl_report_offline().
      
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20150930164128.3733.59876.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9135949d
    • Masami Hiramatsu's avatar
      perf probe: Fix to remove dot suffix from second or latter events · 663b1151
      Masami Hiramatsu authored
      Remove the dot suffix (e.g. .const, .isra) from the second and later
      events, which have numeric suffixes.
      
      The previous commit 35a23ff9 ("perf probe: Cut off the gcc
      optimization postfixes from function name") didn't handle events with
      numeric suffixes, so we get an error when we add additional events on
      the same dot-suffixed function.
      
      e.g.
        ----
        # ./perf probe -f -a get_sigframe.isra.2.constprop.3 \
         -a get_sigframe.isra.2.constprop.3
        Failed to write event: Invalid argument
          Error: Failed to add events.
        ----
      
      This fixes the above issue, as shown below:
        ----
        # ./perf probe -f -a get_sigframe.isra.2.constprop.3 \
         -a get_sigframe.isra.2.constprop.3
        Added new events:
          probe:get_sigframe   (on get_sigframe.isra.2.constprop.3)
          probe:get_sigframe_1 (on get_sigframe.isra.2.constprop.3)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:get_sigframe_1 -aR sleep 1
      
        ----
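
      The naming seen in the output above can be sketched as follows. This
      is a minimal illustration of the result, not the perf implementation:
      make_event_name() and its parameters are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a probe event name from a function name: cut everything from the
 * first '.' (gcc suffixes such as .isra.2 or .constprop.3), then append
 * "_<index>" for the second and later events on the same function.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int make_event_name(char *buf, size_t len, const char *func, int index)
{
	size_t base_len = strcspn(func, ".");	/* length up to the first '.' */
	int ret;

	if (index == 0)
		ret = snprintf(buf, len, "%.*s", (int)base_len, func);
	else
		ret = snprintf(buf, len, "%.*s_%d", (int)base_len, func, index);

	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

      Calling it with "get_sigframe.isra.2.constprop.3" and index 0, then
      index 1, yields the two event names from the example, get_sigframe
      and get_sigframe_1.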
      
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20150930164130.3733.26573.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      663b1151
    • Arnaldo Carvalho de Melo's avatar
      tools lib symbol: Introduce kallsyms2elf_type · f845086a
      Arnaldo Carvalho de Melo authored
      
      
      Map 't' and 'T' (local and global text) and 'w' and 'W' (local and
      global weak text) to STT_FUNC, and everything else to STT_OBJECT.
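
      A minimal sketch of that mapping, assuming a hypothetical function
      name and locally defined constants that mirror the STT_OBJECT and
      STT_FUNC values from the ELF specification:

```c
#include <ctype.h>

/* Local copies of the ELF symbol type values (STT_OBJECT = 1, STT_FUNC = 2). */
enum { SKETCH_STT_OBJECT = 1, SKETCH_STT_FUNC = 2 };

/* Map a kallsyms type letter to an ELF symbol type. */
static unsigned char kallsyms_letter_to_elf_type(char type)
{
	int c = tolower((unsigned char)type);

	/* 't'/'T' and 'w'/'W' are treated as functions, the rest as objects. */
	return (c == 't' || c == 'w') ? SKETCH_STT_FUNC : SKETCH_STT_OBJECT;
}
```

      Folding the letter to lower case first keeps the type decision
      independent of the local/global distinction, which the letter's case
      encodes.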
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-sbwcixulpc5v1xuxn3xvm0nn@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f845086a
    • Arnaldo Carvalho de Melo's avatar
      tools lib symbol: Rename kallsyms2elf_type to kallsyms2elf_binding · 8e947f1e
      Arnaldo Carvalho de Melo authored
      
      
      It is about binding, not type. We have just one letter in kallsyms that
      should map both to the ELF symbol type (STT_FUNC, etc) and to the ELF
      symbol binding (STB_WEAK, STB_GLOBAL, etc), so rename it now, before
      introducing kallsyms2elf_type().
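
      One plausible shape for the binding half of that mapping is sketched
      below. The function name and the exact rule ('W' is weak, other upper
      case letters are global, lower case letters are local) are
      assumptions for illustration; the constants mirror the STB_* values
      from the ELF specification.

```c
#include <ctype.h>

/* Local copies of the ELF binding values (STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2). */
enum { SKETCH_STB_LOCAL = 0, SKETCH_STB_GLOBAL = 1, SKETCH_STB_WEAK = 2 };

/* Map a kallsyms type letter to an ELF symbol binding (assumed rule). */
static unsigned char kallsyms_letter_to_elf_binding(char type)
{
	if (type == 'W')
		return SKETCH_STB_WEAK;

	/* The case of the letter distinguishes global from local symbols. */
	return isupper((unsigned char)type) ? SKETCH_STB_GLOBAL : SKETCH_STB_LOCAL;
}
```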
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-uu5vj343ms1q2wm55690on6v@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8e947f1e
    • Arnaldo Carvalho de Melo's avatar
      perf machine: Add method for common kernel_map(FUNCTION) operation · a5e813c6
      Arnaldo Carvalho de Melo authored
      
      
      And it is also a step in the direction of killing the separation of data
      and text maps in map_groups.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-rrds86kb3wx5wk8v38v56gw8@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a5e813c6
    • Arnaldo Carvalho de Melo's avatar
      perf machine: Use machine__kernel_map() thoroughly · 77e65977
      Arnaldo Carvalho de Melo authored
      
      
      In places where we were using its open coded equivalent.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-khkdugcdoqy3tkszm3jdxgbe@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      77e65977
    • Sukadev Bhattiprolu's avatar
      perf tools: Fix build break on powerpc due to sample_reg_masks · eb56db54
      Sukadev Bhattiprolu authored
      
      
      The perf_regs.c file does not get built on powerpc as CONFIG_PERF_REGS
      is false, so the weak definition of 'sample_reg_masks' doesn't get
      picked up.
      
      Adding perf_regs.o to util/Build unconditionally exposes a redefinition
      error for the 'perf_reg_value()' function (due to the static inline
      version in util/perf_regs.h), so use '#ifdef HAVE_PERF_REGS_SUPPORT'
      around that function.
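
      The guard pattern can be sketched in a single file as follows. This
      is an illustration only: reg_value_sketch() and its simplified
      signature are hypothetical, and in perf the two versions live in
      separate files (the .c file and the header).

```c
#include <stdint.h>

#ifdef HAVE_PERF_REGS_SUPPORT
/*
 * Real version, compiled only when register sampling is supported.
 * 'regs' holds one value per bit set in 'mask', in ascending bit order.
 */
int reg_value_sketch(uint64_t *valp, const uint64_t *regs, uint64_t mask, int id)
{
	int i, idx = 0;

	if (id < 0 || id >= 64 || !(mask & (1ULL << id)))
		return -1;		/* register was not sampled */

	for (i = 0; i < id; i++)	/* count sampled registers below 'id' */
		if (mask & (1ULL << i))
			idx++;

	*valp = regs[idx];
	return 0;
}
#else
/* Stub version, playing the role of the static inline in the header. */
static inline int reg_value_sketch(uint64_t *valp, const uint64_t *regs,
				   uint64_t mask, int id)
{
	(void)valp; (void)regs; (void)mask; (void)id;
	return 0;
}
#endif
```

      Because only one branch of the #ifdef is ever compiled, the file can
      be built unconditionally without the two definitions colliding.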
      
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: linuxppc-dev@ozlabs.org
      Link: http://lkml.kernel.org/r/20150930182836.GA27858@us.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb56db54
    • Adrian Hunter's avatar
      perf report: Amend documentation about max_stack and synthesized callchains · 40862a7b
      Adrian Hunter authored
      
      
      The --max-stack option was added as an optimization to reduce processing
      time, so people specifying --max-stack might see increased processing time
      if it is combined with synthesized callchains, but there is otherwise no
      real harm.
      
      A warning about setting both --max-stack and the synthesized callchains max
      depth seems like overkill.  Amend the documentation.
      
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/560A5155.4060105@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      40862a7b
    • Arnaldo Carvalho de Melo's avatar
      perf maps: Introduce maps__find_symbol_by_name() · b7f9ff56
      Arnaldo Carvalho de Melo authored
      
      
      Factor it out of map_groups__find_symbol_by_name(), so that we can first
      turn the latter into calls to maps__find_symbol_by_name() for
      MAP__FUNCTION and MAP__VARIABLE, and then into just one call once the
      MAP__FUNCTION and MAP__VARIABLE maps are merged to simplify the code.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-pvkar0jacqn92g148u9sqttt@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b7f9ff56
    • Jiri Olsa's avatar
      perf tools: Fix shadowed declaration in parse-events.c · 272ed29a
      Jiri Olsa authored
      
      
      The error variable breaks build on CentOS 6.7, due to a collision with a
      global error symbol:
      
          CC       util/parse-events.o
        cc1: warnings being treated as errors
        util/parse-events.c:419: error: declaration of ‘error’ shadows a global
        declaration
        util/util.h:135: error: shadowed declaration is here
        util/parse-events.c: In function ‘add_tracepoint_multi_event’:
        ...
      
      Use different argument names to fix it.
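
      The clash and the fix can be sketched as follows. The names are
      hypothetical: error() stands in for the global declared in
      util/util.h, and report_failure() and err_msg are invented for
      illustration.

```c
#include <stdio.h>

/* Stand-in for the global error() helper (assumption, simplified). */
static int error(const char *msg)
{
	return fprintf(stderr, "error: %s\n", msg);
}

/*
 * A parameter named 'error' here would shadow the function above, which
 * the gcc in the build log reports under -Wshadow and which becomes fatal
 * when warnings are treated as errors. A distinct name avoids the clash.
 */
static int report_failure(const char *err_msg)
{
	return error(err_msg);
}
```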
      
      Reported-by: Vinson Lee <vlee@twopensource.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: linux-tip-commits@vger.kernel.org
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Raphael Beamonte <raphael.beamonte@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150929150531.GI27383@krava.redhat.com
      [ Fix one more case, at line 770 ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      272ed29a
    • Jiri Olsa's avatar
      tools: Fix shadowed declaration in err.h · 45633a16
      Jiri Olsa authored
      
      
      The error variable breaks the build on CentOS 6.7, due to a collision
      with a global error symbol:
      
          CC       util/evlist.o
        cc1: warnings being treated as errors
        In file included from util/evlist.c:28:
        tools/include/linux/err.h: In function ‘ERR_PTR’:
        tools/include/linux/err.h:34: error: declaration of ‘error’ shadows a global declaration
        util/util.h:135: error: shadowed declaration is here
      
      Use the name 'error_' instead to fix it.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/n/tip-i9mdgdbrgauy3fe76s9rd125@git.kernel.org
      Reported-by: Vinson Lee <vlee@twopensource.com>
      [ Use 'error_' instead of 'err' to, visually, not diverge too much from include/linux/err.h ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      45633a16
  4. Sep 29, 2015
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo' of... · 9c17dbc6
      Ingo Molnar authored
      
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
        - Accept a zero --itrace period, meaning "as often as possible".  In the case
          of Intel PT that is the same as a period of 1 and a unit of 'instructions'
          (i.e.  --itrace=i1i). (Adrian Hunter)
      
        - Harmonize itrace's synthesized callchains with the existing --max-stack
          tool option. (Adrian Hunter)
      
        - Allow time to be displayed in nanoseconds in 'perf script'. (Adrian Hunter)
      
        - Fix potential infinite loop when handling Intel PT timestamps. (Adrian Hunter)
      
        - Slightly improve Intel PT debug logging. (Adrian Hunter)
      
        - Warn when AUX data has been lost, just like when processing PERF_RECORD_LOST.
          (Adrian Hunter)
      
        - Further document export-to-postgresql.py script. (Adrian Hunter)
      
        - Add option to synthesize branch stack from auxtrace data. (Adrian Hunter)
      
        - Use equivalent logic to avoid using dso->kernel. (Arnaldo Carvalho de Melo)
      
        - Show proper error messages when parsing bad terms for hw/sw events. (He Kuang)
      
        - Tracepoint event parsing improvements. (He Kuang)
      
        - Store tracing mountpoint for better error message. (Jiri Olsa)
      
        - Add fixdep to tools/build, bringing it closer to the kernel counterpart, from
          where it is being lifted. (Jiri Olsa)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9c17dbc6
    • He Kuang's avatar
      perf tools: Enable event_config terms to tracepoint events · e637d177
      He Kuang authored
      
      
      This patch enables config terms for tracepoint perf events. Valid terms
      for tracepoint events are 'call-graph' and 'stack-size', so we can use
      different callgraph settings for each event and eliminate unnecessary
      overhead.
      
      Here is an example of using a different call-graph config for each
      tracepoint:
      
        $ perf record -e syscalls:sys_enter_write/call-graph=fp/ \
                      -e syscalls:sys_exit_write/call-graph=no/ \
                      dd if=/dev/zero of=test bs=4k count=10
      
        $ perf report --stdio
      
        #
        # Total Lost Samples: 0
        #
        # Samples: 13  of event 'syscalls:sys_enter_write'
        # Event count (approx.): 13
        #
        # Children      Self  Command  Shared Object       Symbol
        # ........  ........  .......  ..................  ......................
        #
            76.92%    76.92%  dd       libpthread-2.20.so  [.] __write_nocancel
                         |
                         ---__write_nocancel
      
            23.08%    23.08%  dd       libc-2.20.so        [.] write
                         |
                         ---write
                            |
                            |--33.33%-- 0x2031342820736574
                            |
                            |--33.33%-- 0xa6e69207364726f
                            |
                             --33.33%-- 0x34202c7320393039
        ...
      
        # Samples: 13  of event 'syscalls:sys_exit_write'
        # Event count (approx.): 13
        #
        # Children      Self  Command  Shared Object       Symbol
        # ........  ........  .......  ..................  ......................
        #
            76.92%    76.92%  dd       libpthread-2.20.so  [.] __write_nocancel
            23.08%    23.08%  dd       libc-2.20.so        [.] write
             7.69%     0.00%  dd       [unknown]           [.] 0x0a6e69207364726f
             7.69%     0.00%  dd       [unknown]           [.] 0x2031342820736574
             7.69%     0.00%  dd       [unknown]           [.] 0x34202c7320393039
      
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1443412336-120050-4-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e637d177
    • He Kuang's avatar
      perf tools: Adds the tracepoint name parsing support · 865582c3
      He Kuang authored
      
      
      Add rules for parsing tracepoint names. Change the tracepoint rules,
      which were derived from PE_NAMEs, to use tracepoint names directly, so
      that adding more rules based on tracepoint names will be easier.
      
      Changes v2-v3:
         - Change __event_legacy_tracepoint label in bison file to tracepoint_name
         - Fix formats error.
      
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1443412336-120050-3-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      865582c3
    • He Kuang's avatar
      perf tools: Show proper error message for wrong terms of hw/sw events · ffeb883e
      He Kuang authored
      
      
      Show a proper error message and the valid terms when wrong config terms
      are specified for hw/sw type perf events.
      
      This patch makes the original error format function formats_error_string()
      more generic: it outputs only the static config terms for hw/sw perf
      events, and prepends the pmu formats for pmu events.
      
      Before this patch:
      
        $ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
        invalid or unsupported event: 'cpu-clock/freqx=200/'
        Run 'perf list' for a list of valid events
      
         usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      
      After this patch:
      
        $ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
        event syntax error: 'cpu-clock/freqx=200/'
                                       \___ unknown term
      
        valid terms: config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size
      
        Run 'perf list' for a list of valid events
      
         usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1443412336-120050-2-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ffeb883e
    • He Kuang's avatar
      perf tools: Adds the config_term callback for different type events · 0b8891a8
      He Kuang authored
      
      
      Currently, the function config_term() is used for checking the config
      terms of all types of events, and unknown terms are not reported as an
      error because pmu events have valid terms in sysfs.
      
      But this is wrong when unknown terms are specified for hw/sw events.
      This patch adds the config_term callback so we can use separate check
      routines for each type of event.
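
      The callback idea can be sketched as follows. This is a minimal
      illustration rather than the perf code: the typedef, the function
      names and the string-based term representation are all hypothetical.
      The list of static terms is taken from the "valid terms" line shown
      in the error message of the related commit on this page.

```c
#include <string.h>

/* Hypothetical callback type: returns 0 if the term is valid, -1 otherwise. */
typedef int (*config_term_fn)(const char *term);

/* hw/sw events: only a fixed set of static terms is accepted. */
static int config_term_common(const char *term)
{
	static const char *const valid[] = {
		"config", "config1", "config2", "name", "period", "freq",
		"branch_type", "time", "call-graph", "stack-size", NULL
	};
	const char *const *v;

	for (v = valid; *v; v++)
		if (strcmp(term, *v) == 0)
			return 0;
	return -1;	/* unknown term is an error for hw/sw events */
}

/* pmu events: unknown terms may be formats defined in sysfs, so accept them. */
static int config_term_pmu(const char *term)
{
	(void)term;
	return 0;
}

/* Check a NULL-terminated list of terms with the per-event-type callback. */
static int check_terms(const char *const *terms, config_term_fn check)
{
	for (; *terms; terms++)
		if (check(*terms))
			return -1;
	return 0;
}
```

      Passing the check routine as a parameter keeps the term loop shared
      while letting each event type decide what counts as an unknown term.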
      
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1443412336-120050-1-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0b8891a8
    • Adrian Hunter's avatar
      perf intel-pt: Add mispred-all config option to aid use with autofdo · ba11ba65
      Adrian Hunter authored
      
      
      autofdo incorrectly expects branch flags to include either mispred or
      predicted.  In fact mispred = predicted = 0 is valid and means the flags
      are not supported, which they aren't by Intel PT.
      
      To make autofdo work, add a config option which will cause the Intel PT
      decoder to set the mispred flag on all branches.
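
      The effect of the option can be sketched as follows. The struct and
      function are hypothetical simplifications for illustration, not the
      decoder's actual data structures.

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-in for per-branch prediction flags. */
struct branch_flags_sketch {
	bool mispred;
	bool predicted;
};

/* Fill in the flags for one decoded branch. */
static void set_branch_flags(struct branch_flags_sketch *flags, bool mispred_all)
{
	/* Intel PT does not report prediction, so both flags start at 0. */
	flags->mispred = false;
	flags->predicted = false;

	/* With the config option on, mark every branch as mispredicted. */
	if (mispred_all)
		flags->mispred = true;
}
```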
      
      Below is an example of using Intel PT with autofdo.  The example is
      also added to the Intel PT documentation.  It requires autofdo
      (https://github.com/google/autofdo) and gcc version 5.  The bubble
      sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial)
      amended to take the number of elements as a parameter.
      
      	$ gcc-5 -O3 sort.c -o sort_optimized
      	$ ./sort_optimized 30000
      	Bubble sorting array of 30000 elements
      	2254 ms
      
      	$ cat ~/.perfconfig
      	[intel-pt]
      		mispred-all
      
      	$ perf record -e intel_pt//u ./sort 3000
      	Bubble sorting array of 3000 elements
      	58 ms
      	[ perf record: Woken up 2 times to write data ]
      	[ perf record: Captured and wrote 3.939 MB perf.data ]
      	$ perf inject -i perf.data -o inj --itrace=i100usle --strip
      	$ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
      	$ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
      	$ ./sort_autofdo 30000
      	Bubble sorting array of 30000 elements
      	2155 ms
      
      Note there is currently no advantage to using Intel PT instead of LBR,
      but that may change in the future if greater use is made of the data.
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1443186956-18718-26-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ba11ba65