  1. Apr 08, 2015
    • perf kmem: Respect -i option · 28939e1a
      Jiri Olsa authored
      
      
      Currently perf kmem does not respect the -i option.
      
      Initialize file.path properly after the options have been parsed.
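      The ordering matters because the path has to come from the parsed
      option. The following is only a minimal sketch of that idea, not the
      actual patch: `struct data_file`, `parse_input_name()` and
      `setup_data_file()` are hypothetical names, and the hand-rolled parser
      stands in for perf's own option parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the structure that carries the input path. */
struct data_file {
	const char *path;
};

/* Hypothetical minimal parser: returns the argument of the last "-i", or NULL. */
static const char *parse_input_name(int argc, char **argv)
{
	const char *name = NULL;

	assert(argv != NULL);
	for (int i = 1; i + 1 < argc; i++) {
		if (strcmp(argv[i], "-i") == 0)
			name = argv[++i];
	}
	return name;
}

/* Parse the options first, then fill in the path, so that "-i" is honored. */
static struct data_file setup_data_file(int argc, char **argv)
{
	const char *input_name = parse_input_name(argc, argv);
	struct data_file file = {
		.path = input_name ? input_name : "perf.data",
	};

	return file;
}
```

      Filling in the path before parsing would always yield the default
      name, which is the symptom described above.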
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1428298576-9785-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools lib traceevent: Honor operator priority · 3201f0dc
      Namhyung Kim authored
      
      
      Currently it ignores operator priority and just sets the processed args
      as the right operand.  But this can result in priority inversion when
      the right operand is also an operator arg and its priority is lower.
      
      For example, the following print format is from the new kmem events.
      
        "page=%p", REC->pfn != -1UL ? (((struct page *)(0xffffea0000000000UL)) + (REC->pfn)) : ((void *)0)
      
      But this was treated as below:
      
        REC->pfn != ((null - 1UL) ? ((struct page *)0xffffea0000000000UL + REC->pfn) : (void *) 0)
      
      In this case, the right arg was the '?' operator, which has lower
      priority. But the whole arg was set as the right operand, making the
      output confusing: page was always 0 or 1, since that is the result of
      the logical operation.
      
      With this patch, it is handled properly, as follows:
      
        ((REC->pfn != (null - 1UL)) ? ((struct page *)0xffffea0000000000UL + REC->pfn) : (void *) 0)
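      The regrouping above can be pictured as a rotation of the expression
      tree. The sketch below is an illustration under stated assumptions,
      not the libtraceevent code: `struct node`, `op_prio()`, `new_node()`
      and `attach_right()` are hypothetical names, and the priority table is
      a small illustrative subset in which a larger number binds more
      loosely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified expression node. */
enum node_type { NODE_ATOM, NODE_OP };

struct node {
	enum node_type type;
	const char *text;	/* atom text or operator token */
	struct node *left;	/* for '?', the left child is the condition */
	struct node *right;
};

/* Illustrative priorities: a larger number binds more loosely. */
static int op_prio(const char *op)
{
	if (!strcmp(op, "*") || !strcmp(op, "/"))
		return 1;
	if (!strcmp(op, "+") || !strcmp(op, "-"))
		return 2;
	if (!strcmp(op, "==") || !strcmp(op, "!="))
		return 3;
	if (!strcmp(op, "?"))
		return 4;
	return 0;	/* anything else is treated as binding tightest */
}

static struct node *new_node(enum node_type type, const char *text,
			     struct node *left, struct node *right)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		abort();
	n->type = type;
	n->text = text;
	n->left = left;
	n->right = right;
	return n;
}

/*
 * Build "left op right". If 'right' is an operator that binds more loosely
 * than 'op', rotate: 'op' takes only the left child of 'right' as its right
 * operand, and 'right' becomes the root of the result.
 */
static struct node *attach_right(const char *op, struct node *left,
				 struct node *right)
{
	assert(op && left && right);
	if (right->type == NODE_OP && op_prio(right->text) > op_prio(op)) {
		right->left = attach_right(op, left, right->left);
		return right;
	}
	return new_node(NODE_OP, op, left, right);
}
```

      With '!=' on the left and a '?' node on the right, `attach_right()`
      returns the '?' node with the comparison as its condition, which
      matches the grouping shown after the patch.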
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1428298576-9785-10-git-send-email-namhyung@kernel.org
      [ Replaced 'swap' with 'rotate' in a comment as requested by Steve and agreed by Namhyung ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kmaps: Check kmaps to make code more robust · ba92732e
      Wang Nan authored
      
      
      This patch adds checks in places where map__kmap is used to get kmaps
      from struct kmap.
      
      Error messages are added to map__kmap to warn about invalid accesses to
      kmap (in the case of !map->dso->kernel, kmap(map) does not exist at all).
      
      Also, it introduces map__kmaps() to warn about uninitialized kmaps.
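      A rough sketch of what such checks could look like follows. The
      structures are stripped-down, hypothetical stand-ins (in particular,
      embedding `struct kmap` in `struct map` is an assumption made to keep
      the example self-contained), and the message texts are illustrative
      rather than quoted from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, stripped-down stand-ins for perf's structures. */
struct map_groups { int unused; };
struct kmap { struct map_groups *kmaps; };
struct dso { bool kernel; };
struct map {
	struct dso *dso;
	struct kmap kmap;	/* only meaningful when dso->kernel is true */
};

/* Return the kmap of a kernel map, or NULL (with a warning) otherwise. */
static struct kmap *map__kmap(struct map *map)
{
	assert(map != NULL);
	if (!map->dso || !map->dso->kernel) {
		fprintf(stderr, "Internal error: map__kmap with a non-kernel map\n");
		return NULL;
	}
	return &map->kmap;
}

/* Return the kmaps of a kernel map, warning when they were never set up. */
static struct map_groups *map__kmaps(struct map *map)
{
	struct kmap *kmap = map__kmap(map);

	if (!kmap || !kmap->kmaps) {
		fprintf(stderr, "Internal error: map__kmaps with uninitialized kmaps\n");
		return NULL;
	}
	return kmap->kmaps;
}
```

      Callers then test the returned pointer instead of dereferencing it
      unconditionally.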
      
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/r/1428394966-131044-2-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Fix inverted logic in perf_mmap__empty · 8ea92ceb
      He Kuang authored
      
      
      perf_evlist__mmap_consume() uses perf_mmap__empty() to judge whether
      perf_mmap is empty and can be released. But the result is inverted so
      fix it.
      
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1428399071-7141-1-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. Apr 03, 2015
    • Merge tag 'perf-core-for-mingo' of... · 6645f318
      Ingo Molnar authored
      
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Support unnamed union/structure members data collection in 'perf probe'. (Masami Hiramatsu)
      
        - Add the missing -f option to override perf.data file ownership. (Yunlong Song)
      
      Infrastructure changes:
      
        - No need to lookup thread twice when processing samples in 'perf script'. (Arnaldo Carvalho de Melo)
      
        - No need to pass thread twice to the scripting callbacks. (Arnaldo Carvalho de Melo)
      
        - No need to pass thread twice to the db-export facility. (Arnaldo Carvalho de Melo)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf data: Support using -f to override perf.data file ownership for 'convert' · bd05954b
      Yunlong Song authored
      
      
      Enable perf data convert to use perf.data when it is not owned by
      current user or root.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28260 Apr  2 17:35 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf data convert --to-ctf=./ctf-data/
       File perf.data not owned by current user or root (use -f to override)
       # perf data convert --to-ctf=./ctf-data/ -f
         Error: unknown switch `f'
      
        usage: perf data convert [<options>]
      
           -v, --verbose         be more verbose
           -i, --input <file>    input file name
               --to-ctf ...      Convert to CTF format
      
      After this patch:
      
       # perf data convert --to-ctf=./ctf-data/
       File perf.data not owned by current user or root (use -f to override)
       # perf data convert --to-ctf=./ctf-data/ -f
       # ls ctf-data/
       metadata  perf_stream_0
      
      As shown above, the -f option really works now.
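      The behaviour seen in the transcripts (refuse a file owned by someone
      other than the current user or root unless -f is given) can be sketched
      as below. `owner_check_passes()` and `check_input_file()` are
      hypothetical helper names; only the error text is taken from the
      output shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Pure decision: the file must belong to root or the current user,
 * unless the check is overridden with 'force'. */
static bool owner_check_passes(uid_t file_uid, uid_t current_uid, bool force)
{
	if (force)
		return true;
	return file_uid == 0 || file_uid == current_uid;
}

/* Hypothetical wrapper that looks up the owner of 'path' with stat(2). */
static int check_input_file(const char *path, bool force)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) < 0)
		return -1;
	if (!owner_check_passes(st.st_uid, geteuid(), force)) {
		fprintf(stderr,
			"File %s not owned by current user or root (use -f to override)\n",
			path);
		return -1;
	}
	return 0;
}
```

      In the example session, root (uid 0) reads a file owned by another
      user, so the check fails until -f is passed.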
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-11-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Support using -f to override perf.data file ownership · e366a6d8
      Yunlong Song authored
      
      
      Enable perf trace to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf trace record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 4153101 Apr  2 15:28 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf trace -i perf.data
       File perf.data not owned by current user or root (use -f to override)
       # perf trace -i perf.data -f
         Error: unknown switch `f'
      
        usage: perf trace [<options>] [<command>]
           or: perf trace [<options>] -- <command> [<options>]
           or: perf trace record [<options>] [<command>]
           or: perf trace record [<options>] -- <command> [<options>]
      
               --event <event>   event selector. use 'perf list' to list
                                 available events
               --comm            show the thread COMM next to its id
               --tool_stats      show tool stats
           -e, --expr <expr>     list of events to trace
           -o, --output <file>   output file name
           -i, --input <file>    Analyze events in file
           -p, --pid <pid>       trace events on existing process id
           -t, --tid <tid>       trace events on existing thread id
               --filter-pids <float>
        ...
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf trace -i perf.data
       File perf.data not owned by current user or root (use -f to override)
       # perf trace -i perf.data -f
       0.056 ( 0.002 ms): ls/47325 brk(                                 ...
       0.108 ( 0.018 ms): ls/47325 mmap(len: 4096, prot: READ|WRITE,    ...
       0.145 ( 0.013 ms): ls/47325 access(filename: 0x7f31259a0eb0,     ...
       0.172 ( 0.008 ms): ls/47325 open(filename: 0x7fffeb9a0d00,       ...
       0.180 ( 0.004 ms): ls/47325 stat(filename: 0x7fffeb9a0d00,       ...
       0.185 ( 0.004 ms): ls/47325 open(filename: 0x7fffeb9a0d00,       ...
       0.189 ( 0.003 ms): ls/47325 stat(filename: 0x7fffeb9a0d00,       ...
       0.195 ( 0.004 ms): ls/47325 open(filename: 0x7fffeb9a0d00,       ...
       0.199 ( 0.002 ms): ls/47325 stat(filename: 0x7fffeb9a0d00,       ...
       0.205 ( 0.004 ms): ls/47325 open(filename: 0x7fffeb9a0d00,       ...
       0.211 ( 0.004 ms): ls/47325 stat(filename: 0x7fffeb9a0d00,       ...
       0.220 ( 0.007 ms): ls/47325 open(filename: 0x7f312599e8ff,       ...
       ...
       ...
      
      As shown above, the -f option really works now.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-10-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf timechart: Support using -f to override perf.data file ownership · 44f7e432
      Yunlong Song authored
      
      
      Enable perf timechart to use perf.data when it is not owned by current
      user or root.
      
      Example:
      
       # perf timechart record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 5471744 Apr  2 15:15 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf timechart
       File perf.data not owned by current user or root (use -f to override)
       # perf timechart -f
         Error: unknown switch `f'
      
        usage: perf timechart [<options>] {record}
      
           -i, --input <file>    input file name
           -o, --output <file>   output file name
           -w, --width <n>       page width
               --highlight <duration or task name>
                                 highlight tasks. Pass duration in ns or process name.
           -P, --power-only      output power data only
           -T, --tasks-only      output processes data only
           -p, --process <process>
                                 process selector. Pass a pid or process name.
               --symfs <directory>
                                 Look for files with symbols relative to this directory
           -n, --proc-num <n>    min. number of tasks to print
           -t, --topology        sort CPUs according to topology
               --io-skip-eagain  skip EAGAIN errors
               --io-min-time <time>
                                 all IO faster than min-time will visually appear longer
               --io-merge-dist <time>
                                 merge events that are merge-dist us apart
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf timechart
       File perf.data not owned by current user or root (use -f to override)
       # perf timechart -f
       Written 0.0 seconds of trace to output.svg.
       # cat output.svg
       <?xml version="1.0" standalone="no"?>
       <!DOCTYPE svg SYSTEM "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
       <svg width="1000" height="10110" version="1.1" xmlns="http://www.w3.org/2000/svg">
       <defs>
         <style type="text/css">
           <![CDATA[
             rect          { stroke-width: 1; }
       ...
       ...
      
      As shown above, the -f option really works now.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-9-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Support using -f to override perf.data file ownership · 06af0f2c
      Yunlong Song authored
      
      
      Enable perf script to use perf.data when it is not owned by current user
      or root. Change the short option name of --fields to -F to avoid confusion
      with --force.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28360 Apr  2 14:53 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf script
       File perf.data not owned by current user or root (use -f to override)
       # perf script -f
         Error: switch `f' requires a value
      
        usage: perf script [<options>]
           or: perf script [<options>] record <script> [<record-options>] <command>
           or: perf script [<options>] report <script> [script-args]
           or: perf script [<options>] <script> [<record-options>] <command>
           or: perf script [<options>] <top-script> [script-args]
      
           -f, --fields <str>    comma separated output fields prepend with
           'type:'. Valid types: hw,sw,trace,raw. Fields:
           comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,period
      
      As shown above, the -f option does not work at all. Moreover, -f is
      already taken by --fields, which makes it confusing to use for --force,
      so change the short option name of --fields to -F, as other perf
      commands do (e.g. perf report -F), and use -f as the short option name
      of --force.
      
      After this patch:
      
       # perf script
       File perf.data not owned by current user or root (use -f to override)
       # perf script -f
       :41298 41298 2590086.564226:          1 cycles:  ffffffff8103efc6
       native_write_msr_safe ([kernel.kallsyms])
       :41298 41298 2590086.564244:          1 cycles:  ffffffff8103efc6
       native_write_msr_safe ([kernel.kallsyms])
       :41298 41298 2590086.564249:          7 cycles:  ffffffff8103efc6
       native_write_msr_safe ([kernel.kallsyms])
       :41298 41298 2590086.564255:        176 cycles:  ffffffff8103efc6
       native_write_msr_safe ([kernel.kallsyms])
           ls 41298 2590086.567346:       4059 cycles:  ffffffff8105a592
           raise_softirq ([kernel.kallsyms])
           ls 41298 2590086.567353:       3717 cycles:  ffffffff8105a592
           raise_softirq ([kernel.kallsyms])
           ls 41298 2590086.567358:      63058 cycles:  ffffffff8105a592
           raise_softirq ([kernel.kallsyms])
           ls 41298 2590086.567448:    1706255 cycles:            406ae0
           [unknown] (/usr/bin/ls)
      
      As shown above, the -f option really works now.
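      The option layout after the change (-f for --force without a value, -F
      for --fields with a value) can be illustrated with getopt_long(). This
      is a swapped-in technique for illustration only: perf uses its own
      option parser, and `struct script_opts` and `parse_script_opts()` are
      hypothetical names.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the two options discussed above. */
struct script_opts {
	bool force;		/* -f, --force: takes no value */
	const char *fields;	/* -F, --fields <str>: requires a value */
};

static int parse_script_opts(int argc, char **argv, struct script_opts *opts)
{
	static const struct option long_opts[] = {
		{ "force",  no_argument,       NULL, 'f' },
		{ "fields", required_argument, NULL, 'F' },
		{ NULL, 0, NULL, 0 },
	};
	int c;

	assert(opts != NULL);
	opts->force = false;
	opts->fields = NULL;
	optind = 1;	/* start scanning at the first argument again */

	while ((c = getopt_long(argc, argv, "fF:", long_opts, NULL)) != -1) {
		switch (c) {
		case 'f':
			opts->force = true;
			break;
		case 'F':
			opts->fields = optarg;
			break;
		default:
			return -1;	/* unknown switch or missing value */
		}
	}
	return 0;
}
```

      In the short-option string "fF:", only 'F' is followed by a colon, so
      a bare -f is accepted while -F still expects its field list.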
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-8-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf mem: Support using -f to override perf.data file ownership · 62a1a63a
      Yunlong Song authored
      
      
      Enable perf mem to use perf.data when it is not owned by current user or
      root.
      
      Example:
      
       # perf mem -t load record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 16392 Apr  2 14:34 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf mem -D report
       File perf.data not owned by current user or root (use -f to override)
       # perf mem -D -f report
         Error: unknown switch `f'
      
        usage: perf mem [<options>] {record|report}
      
           -t, --type <type>     memory operations(load,store) Default load,store
           -D, --dump-raw-samples
                                 dump raw samples in ASCII
           -U, --hide-unresolved
                                 Only display entries resolved to a symbol
           -i, --input <file>    input file name
           -C, --cpu <cpu>       list of cpus to profile
           -x, --field-separator <separator>
                                 separator for columns, no spaces will be added
                                 between columns '.' is reserved.
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf mem -D report
       File perf.data not owned by current user or root (use -f to override)
       # perf mem -D -f report
       # PID, TID, IP, ADDR, LOCAL WEIGHT, DSRC, SYMBOL
       39095 39095 0xffffffff81127e40 0x016ffff887f45148338 8 0x68100142
       /proc/kcore:perf_event_aux
       39095 39095 0xffffffff8100a3fe 0xffff89007f8cb7d0 6 0x68100142
       /proc/kcore:native_sched_clock
       39095 39095 0xffffffff81309139 0xffff88bf44c9ded8 6 0x68100142
       /proc/kcore:acpi_map_lookup
       39095 39095 0xffffffff810f8c4c 0xffff89007f8ccd88 6 0x68100142
       /proc/kcore:rcu_nmi_exit
       39095 39095 0xffffffff81136346 0xffff88fea995dd50 6 0x68100142
       /proc/kcore:unlock_page
       39095 39095 0xffffffff812a64a2 0xffff88fea995dcc8 6 0x68100142
       /proc/kcore:half_md4_transform
       39095 39095 0x7f0cf877c7e9 0x25dfb94 6 0x68100142
       /lib64/libc-2.19.so:__readdir64
       39095 39095 0x7f0cf87575a3 0x7f0cf9163731 6 0x68100142
       /lib64/libc-2.19.so:__strcoll_l
       39095 39095 0xffffffff8116910e 0xffffea01c1bfbd50 23 0x68100242
       /proc/kcore:page_remove_rmap
      
      As shown above, the -f option really works now.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-7-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf lock: Support using -f to override perf.data file ownership · c4ac732a
      Yunlong Song authored
      
      
      Enable perf lock to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf lock record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 4880686 Apr  2 14:14 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf lock report
       File perf.data not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf lock report -f
         Error: unknown switch `f'
      
        usage: perf lock report [<options>]
      
           -k, --key <acquired>  key for sorting (acquired / contended /
           avg_wait / wait_total / wait_max / wait_min)
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf lock report
       File perf.data not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf lock report -f
                      Name   acquired  contended   avg wait (ns) total wait (ns) ...
      
       &ldata->output_l...        128          0               0               0 ...
                &ctx->lock        114          0               0               0 ...
               &p->pi_lock        112          0               0               0 ...
       &(&pool->lock)->...        112          0               0               0 ...
       &(&dentry->d_loc...         70          0               0               0 ...
       &(&newf->file_lo...         62          0               0               0 ...
       &(&fs->lock)->rl...         43          0               0               0 ...
       ...
      
      As shown above, the -f option really works now.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-6-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Support using -f to override perf.data.guest file ownership · 8cc5ec1f
      Yunlong Song authored
      
      
      Enable perf kvm to use perf.data.guest when it is not owned by current
      user or root.
      
      Example:
      
       # perf kvm stat record ls
       # chown Yunlong.Song:Yunlong.Song perf.data.guest
       # ls -al perf.data.guest
       -rw------- 1 Yunlong.Song Yunlong.Song 4128937 Apr  2 11:05 perf.data.guest
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf kvm stat report
       File perf.data.guest not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf kvm stat report -f
         Error: unknown switch `f'
      
        usage: perf kvm stat report [<options>]
      
               --event <report event>
                                 event for reporting: vmexit, mmio (x86 only),
                                 ioport (x86 only)
               --vcpu <n>        vcpu id to report
           -k, --key <sort-key>  key for sorting: sample(sort by samples
                                 number) time (sort by avg time)
           -p, --pid <pid>       analyze events only for given process id(s)
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf kvm stat report
       File perf.data.guest not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf kvm stat report -f
       Analyze events for all VMs, all VCPUs:
      
         VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time   Avg time
      
       Total Samples:0, Total events handled time:0.00us.
      
      As shown above, the -f option really works now. Since we have not
      launched any KVM-related process, the result shows 0 samples here.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-5-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kmem: Support using -f to override perf.data file ownership · d1eeb77c
      Yunlong Song authored
      
      
      Enable perf kmem to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf kmem record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 5315665 Apr  2 10:54 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf kmem stat
       File perf.data not owned by current user or root (use -f to override)
       # perf kmem stat -f
         Error: unknown switch `f'
      
        usage: perf kmem [<options>] {record|stat}
      
           -i, --input <file>    input file name
           -v, --verbose         be more verbose (show symbol address, etc)
               --caller          show per-callsite statistics
               --alloc           show per-allocation statistics
           -s, --sort <key[,key2...]>
                                 sort by keys: ptr, call_site, bytes, hit,
                                 pingpong, frag
           -l, --line <num>      show n lines
               --raw-ip          show raw ip instead of symbol
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf kmem stat
       File perf.data not owned by current user or root (use -f to override)
       # perf kmem stat -f
       SUMMARY
       =======
       Total bytes requested: 437599
       Total bytes allocated: 615472
       Total bytes wasted on internal fragmentation: 177873
       Internal fragmentation: 28.900259%
       Cross CPU allocations: 6/1192
      
      As shown above, the -f option really works now.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-4-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Support using -f to override perf.data file ownership · ccaa474c
      Yunlong Song authored
      
      
      Enable perf inject to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28260 Apr  2 10:37 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf inject -v -b -i perf.data -o perf.data.new
       File perf.data not owned by current user or root (use -f to override)
       # perf inject -v -b -i perf.data -o perf.data.new -f
         Error: unknown switch `f'
      
        usage: perf inject [<options>]
      
           -b, --build-ids       Inject build-ids into the output stream
           -i, --input <file>    input file name
           -o, --output <file>   output file name
           -s, --sched-stat      Merge sched-stat and sched-switch for getting
           events where and how long tasks slept
           -v, --verbose         be more verbose (show build ids, etc)
               --kallsyms <file>
                                 kallsyms pathname
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf inject -v -b -i perf.data -o perf.data.new
       File perf.data not owned by current user or root (use -f to override)
       # perf inject -v -b -i perf.data -o perf.data.new -f
       build id event received for [kernel.kallsyms]:
       f6dcb66d8b98f1c0d9eb87bf043444b69f91d30c
       symsrc__init: cannot get elf header.
       Looking at the vmlinux_path (7 entries long)
       Using /proc/kcore for kernel object code
       Using /proc/kallsyms for symbols
      
      As shown above, the -f option really works now.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-3-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Support using -f to override perf.data file ownership · 9e3b6ec1
      Yunlong Song authored
      
      
      Enable perf evlist to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28260 Apr  2 10:18 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf evlist
       File perf.data not owned by current user or root (use -f to override)
       # perf evlist -f
         Error: unknown switch `f'
      
        usage: perf evlist [<options>]
      
           -i, --input <file>    Input file name
           -F, --freq            Show the sample frequency
           -v, --verbose         Show all event attr details
           -g, --group           Show event group information
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf evlist
       File perf.data not owned by current user or root (use -f to override)
       # perf evlist -f
       cycles
      
      As shown above, the -f option really works now.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-2-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Fix to track down unnamed union/structure members · c7273835
      Masami Hiramatsu authored
      
      
      Fix 'perf probe' to track down unnamed union/structure members.
      
      perf probe did not track down the tree of unnamed union/structure
      members, since it just failed to find given "name" in a parent
      structure/union.  To solve this issue, I've introduced 2 changes.
      
      - Fix die_find_member() to track down the type-DIE if it is
        unnamed, and if it contains the specified member, returns the
        unnamed member.
        (note that we don't return found member, since unnamed member
         has the offset in the parent structure)
      - Fix convert_variable_fields() to track down the unnamed union/
        structure (one-by-one).
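
      The two changes above can be pictured with a small model. This is only
      an illustrative sketch in Python, not the perf code: Member,
      find_member() and resolve_offset() are hypothetical names, and real
      DWARF DIEs are replaced by a plain tree. find_member() returns the
      unnamed member that contains the requested name (because that member
      carries the offset in the parent), and resolve_offset() walks down
      one level at a time.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Member:
    """Hypothetical stand-in for a member DIE."""
    name: Optional[str]            # None models an unnamed struct/union member
    offset: int                    # offset inside the direct parent
    members: List["Member"] = field(default_factory=list)


def find_member(parent: Member, name: str) -> Optional[Member]:
    """Return the direct member of `parent` that is, or contains, `name`."""
    for m in parent.members:
        if m.name == name:
            return m
        # Unnamed member: look inside it, but return the unnamed member
        # itself, since it holds the offset relative to `parent`.
        if m.name is None and find_member(m, name) is not None:
            return m
    return None


def resolve_offset(parent: Member, name: str) -> Optional[int]:
    """Descend through unnamed members one by one, summing offsets."""
    total = 0
    cur = parent
    while True:
        m = find_member(cur, name)
        if m is None:
            return None
        total += m.offset
        if m.name == name:
            return total
        cur = m  # step into the unnamed struct/union and continue
```

      With a parent holding an unnamed union at offset 8 that in turn holds
      "ops" at offset 4, resolve_offset() adds both offsets on the way down.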
      
      With this patch, perf probe can access unnamed fields:
        -----
        #./perf probe -nfx ./perf lock__delete ops 'locked_ops=ops->locked.ops'
        Added new event:
          probe_perf:lock__delete (on lock__delete in /home/mhiramat/ksrc/linux-3/tools/perf/perf with ops locked_ops=ops->locked.ops)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe_perf:lock__delete -aR sleep 1
        -----
      
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Report-Link: https://lkml.org/lkml/2015/3/5/431
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150402073312.14482.37942.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c7273835
    • Arnaldo Carvalho de Melo's avatar
      perf db-export: No need to have ->thread twice in struct export_sample · b83e868d
      Arnaldo Carvalho de Melo authored
      
      
      As it comes from address_location->thread, that is already stored as
      export_sample->al, where the thread can be obtained.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20150402141542.GA9630@kernel.org
      Link: http://lkml.kernel.org/n/tip-bzotbl4epoztw0jd6sm2stpf@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b83e868d
    • Arnaldo Carvalho de Melo's avatar
      perf db-export: No need to pass thread twice to db_export__sample · 7327259d
      Arnaldo Carvalho de Melo authored
      
      
      As it is available via another parameter, address_location->thread.
      
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: lkml.kernel.org/r/551D08F8.3040706@intel.com
      Link: http://lkml.kernel.org/n/tip-6dbn0tcm9hyv92g7h3zj2dbt@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7327259d
    • Arnaldo Carvalho de Melo's avatar
      perf scripting: No need to pass thread twice to the scripting callbacks · f9d5d549
      Arnaldo Carvalho de Melo authored
      
      
      It is already in the addr_location, so remove the redundant 'thread'
      parameter from the callback signatures.
      
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1427906210-10519-3-git-send-email-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f9d5d549
    • Arnaldo Carvalho de Melo's avatar
      perf script: No need to lookup thread twice · 79628f2c
      Arnaldo Carvalho de Melo authored
      
      
      We get the thread when we call perf_event__preprocess_sample(), no need
      to do it before that.
      
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1427906210-10519-2-git-send-email-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      79628f2c
  3. Apr 02, 2015
    • Ingo Molnar's avatar
      perf/x86/intel/pt: Fix the 32-bit build · 2e54a5bd
      Ingo Molnar authored
      
      
      On a 32-bit build I got:
      
        arch/x86/kernel/cpu/perf_event_intel_pt.c:413:5: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
        arch/x86/kernel/cpu/perf_event_intel_bts.c:162:24: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      
      Fix it. The code should probably be (re-)tested on 32-bit systems to make
      sure all is fine.
      
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kaixu Xia <kaixu.xia@linaro.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Cc: kan.liang@intel.com
      Cc: markus.t.metzger@intel.com
      Cc: mathieu.poirier@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2e54a5bd
    • Andi Kleen's avatar
      perf/x86/intel: Avoid rewriting DEBUGCTL with the same value for LBRs · cd1f11de
      Andi Kleen authored
      
      
      perf with LBRs on has a tendency to rewrite the DEBUGCTL MSR with
      the same value. Add a little optimization to skip the unnecessary
      write.
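
      As a rough illustration of the optimization, the sketch below models a
      read-modify-write that skips the write when the value would not change.
      It is a toy model in Python, not the kernel code: FakeMsr and set_bits()
      are hypothetical names, and the two bit positions are assumptions chosen
      to match the low bits of the 0x1801 value seen in MSR traces.

```python
# Assumed bit positions, for illustration only.
DEBUGCTL_LBR = 1 << 0
DEBUGCTL_FREEZE_LBRS_ON_PMI = 1 << 11


class FakeMsr:
    """Hypothetical register that counts how often it is written."""

    def __init__(self, value: int = 0) -> None:
        self.value = value
        self.writes = 0

    def read(self) -> int:
        return self.value

    def write(self, value: int) -> None:
        self.value = value
        self.writes += 1


def set_bits(msr: FakeMsr, bits: int) -> int:
    """Set `bits` in the register, writing only if the value changes."""
    orig = msr.read()
    new = orig | bits
    if new != orig:          # skip the redundant (and slow) MSR write
        msr.write(new)
    return new
```

      Calling set_bits() twice with the same bits performs one write; the
      second call only reads.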
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1426871484-21285-2-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cd1f11de
    • Andi Kleen's avatar
      perf/x86/intel: Streamline LBR MSR handling in PMI · 1a78d937
      Andi Kleen authored
      
      
      The perf PMI currently does unnecessary MSR accesses when
      LBRs are enabled. We use LBR freezing, or when in callstack
      mode force the LBRs to only filter on ring 3.
      
      So there is no need to disable the LBRs explicitly in the
      PMI handler.
      
      Also we always unnecessarily rewrite LBR_SELECT in the LBR
      handler, even though it can never change.
      
       5)               |  /* write_msr: MSR_LBR_SELECT(1c8), value 0 */
       5)               |  /* read_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */
       5)               |  /* write_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */
       5)               |  /* write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 70000000f */
       5)               |  /* write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0 */
       5)               |  /* write_msr: MSR_LBR_SELECT(1c8), value 0 */
       5)               |  /* read_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */
       5)               |  /* write_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */
      
      This patch:
      
        - Avoids disabling already frozen LBRs unnecessarily in the PMI
        - Avoids changing LBR_SELECT in the PMI
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1426871484-21285-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1a78d937
    • Andi Kleen's avatar
      perf/x86: Only dump PEBS register when PEBS has been detected · 15fde110
      Andi Kleen authored
      
      
      Technically PEBS_ENABLED is only guaranteed to exist when we
      detected PEBS. So add a check for this to the PMU dump function.
      I don't think it can happen on a real CPU, but could in a VM.
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1425059312-18217-4-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      15fde110
    • Andi Kleen's avatar
      perf/x86: Dump DEBUGCTL in PMU dump · da3e606d
      Andi Kleen authored
      
      
      LBRs and LBR freezing are controlled through the DEBUGCTL MSR. So
      dump the state of DEBUGCTL too when dumping the PMU state.
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1425059312-18217-3-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      da3e606d
    • Andi Kleen's avatar
      perf/x86/intel: Reset more state in PMU reset · 8882edf7
      Andi Kleen authored
      
      
      The PMU reset code didn't quite keep up with newer PMU features.
      Improve it a bit to really reset a modern PMU:
      
        - Clear all overflow status
        - Clear LBRs and freezing state
        - Disable fixed counters too
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1425059312-18217-2-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8882edf7
    • Stephane Eranian's avatar
      perf/x86/intel: Make the HT bug workaround conditional on HT enabled · b37609c3
      Stephane Eranian authored
      
      
      This patch disables the PMU HT bug workaround when Hyperthreading (HT)
      is disabled. We cannot do this test immediately when perf_events
      is initialized. We need to wait until the topology information
      is set up properly. As such, we register a later initcall, check
      the topology and potentially disable the workaround. To do this,
      we need to ensure there is no user of the PMU. At this point of
      the boot, the only user is the NMI watchdog, thus we disable
      it during the switch and re-enable it right after.
      
      Having the workaround disabled when it is not needed provides
      some benefits by limiting the overhead in time and space.
      The workaround still ensures correct scheduling of the corrupting
      memory events (0xd0, 0xd1, 0xd2) when HT is off. Those events
      can only be measured on counters 0-3, something else the current
      kernel did not handle correctly.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Cc: maria.n.dimakopoulou@gmail.com
      Link: http://lkml.kernel.org/r/1416251225-17721-13-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b37609c3
    • Stephane Eranian's avatar
      watchdog: Add watchdog enable/disable all functions · b3738d29
      Stephane Eranian authored
      
      
      This patch adds two new functions to enable/disable
      the watchdog across all CPUs.
      
      This will be used by the HT PMU bug workaround code to
      disable/enable the NMI watchdog across quirk enablement.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Cc: maria.n.dimakopoulou@gmail.com
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1416251225-17721-12-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b3738d29
    • Stephane Eranian's avatar
      perf/x86/intel: Limit to half counters when the HT workaround is enabled, to... · c02cdbf6
      Stephane Eranian authored
      
      perf/x86/intel: Limit to half counters when the HT workaround is enabled, to avoid exclusive mode starvation
      
      This patch limits the number of counters available to each CPU when
      the HT bug workaround is enabled.
      
      This is necessary to avoid counter starvation. Starvation can
      arise in a configuration where one HT thread, HT0, is using all 4 counters
      with corrupting events which require exclusion on the sibling HT, HT1.
      
      In such case, HT1 would not be able to schedule any event until HT0
      is done. To mitigate this problem, this patch artificially limits
      the number of counters to 2.
      
      That way, we can guarantee that at least 2 counters are not in exclusive
      mode and therefore allow the sibling thread to schedule events of the
      same type (system vs. per-thread). The 2 counters are not determined
      in advance. We simply set the limit to two events per HT.
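
      The arithmetic behind the limit can be sketched as follows. This is a
      simplified counting model in Python, not the kernel scheduler:
      free_for_sibling() is a hypothetical helper, and it assumes every
      exclusive event on one thread blocks exactly one counter on the sibling.

```python
from typing import Optional


def free_for_sibling(num_counters: int, exclusive_events: int,
                     per_thread_limit: Optional[int] = None) -> int:
    """Counters left usable on the sibling thread.

    `exclusive_events` is how many corrupting events one thread wants to
    schedule; each one that gets a counter blocks the same counter on the
    sibling. `per_thread_limit` caps how many counters a thread may use
    (None means no cap).
    """
    limit = num_counters if per_thread_limit is None else per_thread_limit
    blocked = min(exclusive_events, limit)
    return num_counters - blocked
```

      With 4 counters and no cap, 4 exclusive events leave the sibling with
      nothing; with the cap at 2, the sibling always keeps at least 2.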
      
      This helps mitigate starvation in case of events with specific counter
      constraints such as PREC_DIST.
      
      Note that this does not eliminate the starvation in all cases. But
      it is better than not having it.
      
      (Solution suggested by Peter Zijlstra.)
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Cc: maria.n.dimakopoulou@gmail.com
      Link: http://lkml.kernel.org/r/1416251225-17721-11-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c02cdbf6
    • Stephane Eranian's avatar
      perf/x86/intel: Fix intel_get_event_constraints() for dynamic constraints · a90738c2
      Stephane Eranian authored
      
      
      With dynamic constraints, we need to restart from the static
      constraints each time the intel_get_event_constraints() is called.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1416251225-17721-10-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a90738c2
    • Maria Dimakopoulou's avatar
      perf/x86/intel: Enforce HT bug workaround with PEBS for SNB/IVB/HSW · b63b4b45
      Maria Dimakopoulou authored
      
      
      This patch modifies the PEBS constraint tables for SNB/IVB/HSW
      such that corrupting events supporting PEBS activate the HT
      workaround.
      
      Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1416251225-17721-9-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b63b4b45
    • Maria Dimakopoulou's avatar
      perf/x86/intel: Enforce HT bug workaround for SNB/IVB/HSW · 93fcf72c
      Maria Dimakopoulou authored
      
      
      This patch activates the HT bug workaround for the
      SNB/IVB/HSW processors. This covers non-PEBS mode.
      Activation is done through the constraint tables.
      
      Both client and server processors need this workaround.
      
      Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1416251225-17721-8-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      93fcf72c
    • Maria Dimakopoulou's avatar
      perf/x86/intel: Implement cross-HT corruption bug workaround · e979121b
      Maria Dimakopoulou authored
      
      
      This patch implements a software workaround for a HW erratum
      on Intel SandyBridge, IvyBridge and Haswell processors
      with Hyperthreading enabled. The errata are documented for
      each processor in their respective specification update
      documents:
      
        - SandyBridge: BJ122
        - IvyBridge: BV98
        - Haswell: HSD29
      
      The bug causes silent counter corruption across hyperthreads only
      when measuring certain memory events (0xd0, 0xd1, 0xd2, 0xd3).
      Counters measuring those events may leak counts to the sibling
      counter. For instance, counter 0, thread 0 measuring event 0xd0,
      may leak to counter 0, thread 1, regardless of the event measured
      there. The size of the leak is not predictable. It all depends on
      the workload and the state of each sibling hyper-thread. The
      corrupting events do undercount as a consequence of the leak. The
      leak is compensated automatically only when the sibling counter measures
      the exact same corrupting event AND the workload on the two threads
      is the same. Given that there is no way to guarantee this, a workaround
      is necessary. Furthermore, there is a serious problem if the leaked count
      is added to a low-occurrence event. In that case the corruption on
      the low-occurrence event can be very large, e.g., orders of magnitude.
      
      There is no HW or FW workaround for this problem.
      
      The bug is very easy to reproduce on a loaded system.
      Here is an example on a Haswell client, where CPU0, CPU4
      are siblings. We load the CPUs with a simple triad app
      streaming a large floating-point vector. We use the 0x81d0
      corrupting event (MEM_UOPS_RETIRED:ALL_LOADS) and
      0x20cc (ROB_MISC_EVENTS:LBR_INSERTS). Given we are not
      using the LBR, the 0x20cc event should be zero.
      
        $ taskset -c 0 triad &
        $ taskset -c 4 triad &
        $ perf stat -a -C 0 -e r81d0 sleep 100 &
        $ perf stat -a -C 4 -e r20cc sleep 10
        Performance counter stats for 'system wide':
              139 277 291      r20cc
             10,000969126 seconds time elapsed
      
      In this example, 0x81d0 and 0x20cc are using sibling counters
      on CPU0 and CPU4. 0x81d0 leaks into 0x20cc and corrupts it
      from 0 to 139 million occurrences.
      
      This patch provides a software workaround to this problem by modifying the
      way events are scheduled onto counters by the kernel. The patch forces
      cross-thread mutual exclusion between counters in case a corrupting event
      is measured by one of the hyper-threads. If thread 0, counter 0 is measuring
      event 0xd0, then nothing can be measured on counter 0, thread 1. If no corrupting
      event is measured on any hyper-thread, event scheduling proceeds as before.
      
      The same example run with the workaround enabled yields the correct answer:
      
        $ taskset -c 0 triad &
        $ taskset -c 4 triad &
        $ perf stat -a -C 0 -e r81d0 sleep 100 &
        $ perf stat -a -C 4 -e r20cc sleep 10
        Performance counter stats for 'system wide':
              0 r20cc
             10,000969126 seconds time elapsed
      
      The patch does provide correctness for all non-corrupting events. It does not
      "repatriate" the leaked counts back to the leaking counter. This is planned
      for a second patch series. This patch series makes this repatriation
      easier by guaranteeing the sibling counter is not measuring any useful event.
      
      The patch introduces dynamic constraints for events. That means that events which
      did not have constraints, i.e., could be measured on any counters, may now be
      constrained to a subset of the counters depending on what is going on on the sibling
      thread. The algorithm is similar to a cache coherency protocol. We call it XSU
      in reference to Exclusive, Shared, Unused, the 3 possible states of a PMU
      counter.
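
      To make the dynamic-constraint idea concrete, here is a minimal sketch
      of how a per-event counter mask could be narrowed from the sibling's
      XSU states. It is a Python model under stated assumptions, not the
      kernel implementation: State and usable_counters() are hypothetical
      names, and the rule that a corrupting event also avoids counters the
      sibling uses in Shared state is inferred from the mutual-exclusion
      description above.

```python
from enum import Enum
from typing import Sequence


class State(Enum):
    """The three per-counter states named in the text (XSU)."""
    UNUSED = 0
    SHARED = 1      # sibling measures a non-corrupting event here
    EXCLUSIVE = 2   # sibling measures a corrupting event here


def usable_counters(static_mask: int, sibling_states: Sequence[State],
                    corrupting: bool) -> int:
    """Narrow the static counter mask using the sibling thread's states.

    Bit i of the result is set when counter i is allowed for this event.
    """
    mask = 0
    for i, state in enumerate(sibling_states):
        if not (static_mask >> i) & 1:
            continue                      # not allowed by the static constraint
        if state is State.EXCLUSIVE:
            continue                      # sibling counter would leak into ours
        if corrupting and state is State.SHARED:
            continue                      # we would leak into the sibling's event
        mask |= 1 << i
    return mask
```

      An event with no static constraint (mask 0xf on 4 counters) can thus
      end up restricted to a subset, which is why extra multiplexing may
      appear even when there are fewer events than counters.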
      
      As a consequence of the workaround, users may see an increased amount of event
      multiplexing, even in situations where there are fewer events than counters
      measured on a CPU.
      
      Patch has been tested on all three impacted processors. Note that when
      HT is off, there is no corruption. However, the workaround is still enabled,
      yet not costing too much. Adding dynamic detection of whether HT is on turned out to
      be complex and to require too much code to be justified.
      
      This patch addresses the issue when PEBS is not used. A subsequent patch
      fixes the problem when PEBS is used.
      
      Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      [spinlock_t -> raw_spinlock_t]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1416251225-17721-7-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e979121b
    • Maria Dimakopoulou's avatar
      perf/x86/intel: Add cross-HT counter exclusion infrastructure · 6f6539ca
      Maria Dimakopoulou authored
      
      
      This patch adds a new shared_regs style structure to the
      per-cpu x86 state (cpuc). It is used to coordinate access
      between counters which must be used with exclusion across
      HyperThreads on Intel processors. This new struct is not
      needed on each PMU, thus it is allocated on demand.
      
      Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      [peterz: spinlock_t -> raw_spinlock_t]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1416251225-17721-6-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6f6539ca
    • Stephane Eranian's avatar
      perf/x86: Add 'index' param to get_event_constraint() callback · 79cba822
      Stephane Eranian authored
      
      
      This patch adds an index parameter to the get_event_constraint()
      x86_pmu callback. It is expected to represent the index of the
      event in the cpuc->event_list[] array. When the callback is used
      for fake_cpuc (event validation), then the index must be -1. The
      motivation for passing the index is to use it to index into another
      cpuc array.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Cc: maria.n.dimakopoulou@gmail.com
      Link: http://lkml.kernel.org/r/1416251225-17721-5-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      79cba822
    • Maria Dimakopoulou's avatar
      perf/x86: Add 3 new scheduling callbacks · c5362c0c
      Maria Dimakopoulou authored
      
      
      This patch adds 3 new PMU model specific callbacks
      during the event scheduling done by x86_schedule_events().
      
        ->start_scheduling():  invoked when entering the schedule routine.
        ->stop_scheduling():   invoked at the end of the schedule routine
        ->commit_scheduling(): invoked for each committed event
      
      To be used optionally by model-specific code.
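
      The calling pattern for such optional hooks can be sketched like this.
      It is a toy model in Python, not x86_schedule_events(): call_hook(),
      schedule_events() and the Hooks class are hypothetical, the counter
      assignment is a trivial first-fit, and the assumption that commits
      happen only when every event got a counter is mine.

```python
from typing import Dict, List


def call_hook(pmu: object, name: str, *args) -> None:
    """Invoke an optional callback if the PMU model provides it."""
    hook = getattr(pmu, name, None)
    if hook is not None:
        hook(*args)


def schedule_events(pmu: object, events: List[str],
                    num_counters: int) -> Dict[str, int]:
    """Assign events to counters, firing the three optional hooks."""
    call_hook(pmu, "start_scheduling")          # entering the routine
    assignment: Dict[str, int] = {}
    if len(events) <= num_counters:             # trivial first-fit placement
        for counter, event in enumerate(events):
            assignment[event] = counter
        for event, counter in assignment.items():
            call_hook(pmu, "commit_scheduling", event, counter)
    call_hook(pmu, "stop_scheduling")           # leaving the routine
    return assignment


class Hooks:
    """Example PMU model that records which hooks were called."""

    def __init__(self) -> None:
        self.log: list = []

    def start_scheduling(self) -> None:
        self.log.append("start")

    def commit_scheduling(self, event: str, counter: int) -> None:
        self.log.append(("commit", event, counter))

    def stop_scheduling(self) -> None:
        self.log.append("stop")
```

      A PMU model that defines none of the three methods is scheduled exactly
      as before, which mirrors the "optional" nature of the callbacks.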
      
      Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1416251225-17721-4-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c5362c0c
    • Stephane Eranian's avatar
      perf/x86: Vectorize cpuc->kfree_on_online · 90413464
      Stephane Eranian authored
      
      
      Make the cpuc->kfree_on_online a vector to accommodate
      more than one entry and add the second entry to be
      used by a later patch.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1416251225-17721-3-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      90413464
    • Stephane Eranian's avatar
      perf/x86: Rename x86_pmu::er_flags to 'flags' · 9a5e3fb5
      Stephane Eranian authored
      
      
      Because it will be used for more than just tracking the
      presence of extra registers.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: bp@alien8.de
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Cc: maria.n.dimakopoulou@gmail.com
      Link: http://lkml.kernel.org/r/1416251225-17721-2-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9a5e3fb5
    • Alexander Shishkin's avatar
      perf/x86/intel/bts: Add BTS PMU driver · 8062382c
      Alexander Shishkin authored
      
      
      Add support for Branch Trace Store (BTS) via kernel perf event infrastructure.
      The difference with the existing implementation of BTS support is that this
      one is a separate PMU that exports events' trace buffers to userspace by means
      of AUX area of the perf buffer, which is zero-copy mapped into userspace.
      
      The immediate benefit is that the buffer size can be much bigger, resulting in
      fewer interrupts, no kernel-side copying, and little to no trace
      data loss. Also, kernel code can be traced with this driver.
      
      The old way of collecting BTS traces still works.
      
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kaixu Xia <kaixu.xia@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@infradead.org
      Cc: adrian.hunter@intel.com
      Cc: kan.liang@intel.com
      Cc: markus.t.metzger@intel.com
      Cc: mathieu.poirier@linaro.org
      Link: http://lkml.kernel.org/r/1422614435-114702-1-git-send-email-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8062382c