  1. Oct 20, 2015
    • perf bench: Rename 'mem-memcpy.c' => 'mem-functions.c' · 9b2fa7f3
      Ingo Molnar authored
      
      
      So mem-memcpy.c started out as a simple memcpy() benchmark, then it grew
      memset() functionality and now I plan to add string copy benchmarks as
      well.
      
      This makes the file name a misnomer: rename it to the more generic
      mem-functions.c name.
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-5-git-send-email-mingo@kernel.org
      
      
      [ The "rename" was introducing __unused, wasn't removing the old file,
        and didn't update tools/perf/bench/Build, fix it ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bench: Eliminate unused argument from bench_mem_common() · 2946f59a
      Ingo Molnar authored
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-4-git-send-email-mingo@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bench: Default to all routines in 'perf bench mem' · 27619741
      Ingo Molnar authored
      
      
      So few people know that the --routine option to 'perf bench memcpy/memset'
      exists, and would not know that it's capable of testing the kernel's
      memcpy/memset implementations.
      
      Furthermore, 'perf bench mem all' will not run all routines:
      
      	vega:~> perf bench mem all
      	# Running mem/memcpy benchmark...
      	Routine default (Default memcpy() provided by glibc)
      	# Copying 1MB Bytes ...
      
      	     894.454383 MB/Sec
      	       3.844734 GB/Sec (with prefault)
      
      	# Running mem/memset benchmark...
      	Routine default (Default memset() provided by glibc)
      	# Copying 1MB Bytes ...
      
      	       1.220703 GB/Sec
      	       9.042245 GB/Sec (with prefault)
      
      Because misleadingly the 'all' refers to 'all sub-benchmarks', not 'all
      sub-benchmarks and routines'.
      
      Fix all this by making the memcpy/memset routine default to 'all',
      which results in all the benchmarks being run:
      
      	triton:~> perf bench mem all
      	# Running mem/memcpy benchmark...
      	Routine default (Default memcpy() provided by glibc)
      	# Copying 1MB Bytes ...
      
      	       1.448906 GB/Sec
      	       4.957170 GB/Sec (with prefault)
      	Routine x86-64-unrolled (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
      	# Copying 1MB Bytes ...
      
      	       1.614153 GB/Sec
      	       4.379204 GB/Sec (with prefault)
      	Routine x86-64-movsq (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
      	# Copying 1MB Bytes ...
      
      	       1.570036 GB/Sec
      	       4.264465 GB/Sec (with prefault)
      	Routine x86-64-movsb (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
      	# Copying 1MB Bytes ...
      
      	       1.788576 GB/Sec
      	       6.554111 GB/Sec (with prefault)
      
      	# Running mem/memset benchmark...
      	Routine default (Default memset() provided by glibc)
      	# Copying 1MB Bytes ...
      
      	       2.082223 GB/Sec
      	       9.126752 GB/Sec (with prefault)
      	Routine x86-64-unrolled (unrolled memset() in arch/x86/lib/memset_64.S)
      	# Copying 1MB Bytes ...
      
      	       5.710892 GB/Sec
      	       8.346688 GB/Sec (with prefault)
      	Routine x86-64-stosq (movsq-based memset() in arch/x86/lib/memset_64.S)
      	# Copying 1MB Bytes ...
      
      	       9.765625 GB/Sec
      	      12.520032 GB/Sec (with prefault)
      	Routine x86-64-stosb (movsb-based memset() in arch/x86/lib/memset_64.S)
      	# Copying 1MB Bytes ...
      
      	       9.668936 GB/Sec
      	      12.682630 GB/Sec (with prefault)
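
The change in default can be pictured with a small sketch. This is only an illustration in Python, not the perf C code: the `ROUTINES` table and `select_routines()` are hypothetical names, and the entries simply mirror the memcpy routines printed in the output above.

```python
# Hypothetical routine table; the names and descriptions are copied from the
# benchmark output above, the structure itself is an assumption.
ROUTINES = {
    "default": "Default memcpy() provided by glibc",
    "x86-64-unrolled": "unrolled memcpy() in arch/x86/lib/memcpy_64.S",
    "x86-64-movsq": "movsq-based memcpy() in arch/x86/lib/memcpy_64.S",
    "x86-64-movsb": "movsb-based memcpy() in arch/x86/lib/memcpy_64.S",
}


def select_routines(requested="all"):
    """Return the routine names to benchmark for a given --routine value.

    With no explicit request the selection falls back to 'all', so every
    known routine is run instead of only the glibc default.
    """
    if requested == "all":
        return list(ROUTINES)
    if requested not in ROUTINES:
        raise ValueError(f"unknown routine: {requested!r}")
    return [requested]


if __name__ == "__main__":
    for name in select_routines():
        print(f"Routine {name} ({ROUTINES[name]})")
```

Passing a single name, such as `select_routines("x86-64-movsb")`, still restricts the run to that one routine.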
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-3-git-send-email-mingo@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bench: Improve the 'perf bench mem memcpy' code readability · 13839ec4
      Ingo Molnar authored
      
      
       - improve the readability of initializations
       - fix unnecessary double negations
       - fix ugly line breaks
       - fix other small details
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1445241870-24854-2-git-send-email-mingo@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Suppress libtraceevent warnings · 2690c730
      Namhyung Kim authored
      
      
      Currently libtraceevent emits warnings on unsupported event formats.
      However, it'd be better to see them only when the -v option is given.  To do
      that, the warning() function used in libtraceevent needs to be overridden.
      Thus add set_warning_routine(), analogous to set_die_routine(), and check
      the verbose flag in our warning routine.
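
The override pattern described above can be sketched as follows. This is a Python illustration of the idea only, not the perf implementation: `set_warning_routine()` and `warning()` borrow their names from the commit message, while `make_verbose_gated_routine()`, `get_verbose` and `sink` are hypothetical helpers introduced for the example.

```python
import sys


def default_warning(fmt, *args):
    """Default routine: always print the warning to stderr."""
    print("  Warning: " + (fmt % args), file=sys.stderr)


# The currently installed routine; library code only ever calls warning().
_warning_routine = default_warning


def set_warning_routine(routine):
    """Install a replacement routine, in the spirit of set_die_routine()."""
    global _warning_routine
    _warning_routine = routine


def warning(fmt, *args):
    """Entry point used by the library; dispatches to the installed routine."""
    _warning_routine(fmt, *args)


def make_verbose_gated_routine(get_verbose, sink):
    """Build a routine that emits warnings only when verbosity is above zero.

    get_verbose: callable returning the current verbosity level (assumption).
    sink: callable receiving the formatted message (assumption).
    """
    def routine(fmt, *args):
        if get_verbose() > 0:
            sink("Warning: " + (fmt % args))
    return routine
```

Installing the gated routine once at startup is enough: every later `warning()` call is silenced unless the verbosity level is raised.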
      
      Before:
        # perf test 5
         5: parse events tests                                       :
          Warning: [kvmmmu:kvm_mmu_get_page] bad op token {
          Warning: [kvmmmu:kvm_mmu_sync_page] bad op token {
          Warning: [kvmmmu:kvm_mmu_unsync_page] bad op token {
          Warning: [kvmmmu:kvm_mmu_prepare_zap_page] bad op token {
          Warning: [kvmmmu:fast_page_fault] function is_writable_pte not defined
          ...
         Ok
      
      After:
        # perf test 5
         5: parse events tests                                       : Ok
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1445268229-1601-2-git-send-email-namhyung@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Silence tracepoint event failures · 87191383
      Namhyung Kim authored
      
      
      Currently, when 'perf test' is run by a normal user, it'll fail to
      access tracepoint events.  The output becomes somewhat messy because it
      tries to be nice with long error messages and hints.
      
      IMHO this is not needed for 'perf test' by default and AFAIK 'perf test'
      uses pr_debug() rather than pr_err() for such messages so that one can
      use -v option to see further details on failed testcases if needed.
      
      Before:
        $ perf test
         1: vmlinux symtab matches kallsyms                          : FAILED!
         2: detect openat syscall event                              :Error:
        No permissions to read
        /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
        Hint:	Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
        FAILED!
         3: detect openat syscall event on all cpus                  :Error:
        No permissions to read
        /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
        Hint:	Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
        FAILED!
         ...
      
      After:
        $ perf test
         1: vmlinux symtab matches kallsyms                          : FAILED!
         2: detect openat syscall event                              : FAILED!
         3: detect openat syscall event on all cpus                  : FAILED!
         ...
      
        $ perf test -v 2
         2: detect openat syscall event                              :
        --- start ---
        test child forked, pid 30575
        Error:	    No permissions to read
        /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
        Hint:  Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
      
        test child finished with -1
        ---- end ----
        detect openat syscall event: FAILED!
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1445268229-1601-1-git-send-email-namhyung@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. Oct 14, 2015
    • Merge tag 'perf-core-for-mingo' of... · e9363dee
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Use the alternative with the most descriptive filename containing
          a vmlinux file for a given build-id, providing a better title line
          for tools such as 'annotate'. (Arnaldo Carvalho de Melo)
      
        - Remove help messages about previous right and left arrow keybindings, that
          were repurposed for horizontal scrolling. (Arnaldo Carvalho de Melo)
      
        - Inform how to reset the symbol filter in the hists browser. (top & report)
          (Arnaldo Carvalho de Melo)
      
        - Add 'm' key for context menu display in the hists browser, that became
          inaccessible with the repurposing of the right arrow key for horizontal
          scrolling. (Namhyung Kim)
      
        - Use debug_frame for callchains if eh_frame is unusable. (Rabin Vincent)
      
      Build fixes:
      
        - Fix strict-aliasing breakage with gcc 4.4 in the READ_ONCE/WRITE_ONCE code
          adopted from the kernel tree, that builds with -fno-strict-aliasing while
          tools/perf/ uses -Wstrict-aliasing=3. (Jiri Olsa)
      
        - Fix unw_word_t pointer casts in code using libunwind for callchains,
          fixing the build on at least 32-bit MIPS systems. (Rabin Vincent)
      
        - Work around cross compile build problems related to fixdep. (Jiri Olsa)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. Oct 13, 2015
  4. Oct 08, 2015
    • Merge tag 'perf-core-for-mingo' of... · 0e537fef
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Adding a field via 'perf report -F' that already is enabled makes
          the tool get stuck in a loop, fix it. (Jiri Olsa)
      
      Infrastructure changes:
      
        - Support PERF_RECORD_SWITCH in the python binding. (Arnaldo Carvalho de Melo)
      
        - Fix handling read() result using a signed variable, found with Coccinelle.
          (Andrzej Hajda)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf python: Support the PERF_RECORD_SWITCH event · ae938802
      Arnaldo Carvalho de Melo authored
      To test it check tools/perf/python/twatch.py, after following the
      instructions there to enable context_switch, output looks like:
      
        [root@zoo linux]# tools/perf/python/twatch.py
        cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 0 }
        cpu: 2, pid: 31463, tid: 31496 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31496, switch_out: 0 }
        cpu: 2, pid: 31463, tid: 31496 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31496, switch_out: 1 }
        cpu: 3, pid: 31463, tid: 31527 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31527, switch_out: 0 }
        cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 1 }
        cpu: 3, pid: 31463, tid: 31527 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31527, switch_out: 1 }
        cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 0 }
        ^CTraceback (most recent call last):
          File "tools/perf/python/twatch.py", line 67, in <module>
            main(context_switch = 1, thread = 31463)
          File "tools/perf/python/twatch.py", line 40, in main
            evlist.poll(timeout = -1)
        KeyboardInterrupt
        [root@zoo linux]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Guy Streeter <streeter@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-1ukistmpamc5z717k80ctcp2@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge tag 'perf-urgent-for-mingo' of... · 00e6fa5f
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fix from Arnaldo Carvalho de Melo:
      
        - Fix build break on (at least) powerpc due to sample_reg_masks, not being
          available for linking. (Sukadev Bhattiprolu)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. Oct 07, 2015
  6. Oct 06, 2015
    • perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore · 712df65c
      Taku Izumi authored
      
      
      In a multi-segment system, uncore devices may belong to buses whose segment
      number is other than 0:
      
        ....
        0000:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:7f:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:bf:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
        0001:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
        ...
      
      In that case, the relation between bus number and physical id may be broken
      because "uncore_pcibus_to_physid" doesn't take the PCI segment into account.
      For example, buses 0000:ff and 0001:ff use the same entry of the
      "uncore_pcibus_to_physid" array.
      
      This patch fixes this problem by introducing the segment-aware pci2phy_map instead.
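
The collision and its fix can be illustrated with a short sketch. It is written in Python rather than kernel C and does not reproduce the actual pci2phy_map data structure; the function names are hypothetical, and the physical ids assigned to the four devices from the listing above are made up for the example.

```python
def parse_pci_address(addr):
    """Split 'SSSS:BB:DD.F' into (segment, bus) integers."""
    segment, bus, _devfn = addr.split(":")
    return int(segment, 16), int(bus, 16)


def build_bus_only_map(devices):
    """Old scheme: key on the bus number alone, ignoring the segment."""
    mapping = {}
    for addr, physid in devices:
        _segment, bus = parse_pci_address(addr)
        mapping[bus] = physid  # a later segment silently overwrites an earlier one
    return mapping


def build_segment_aware_map(devices):
    """Segment-aware scheme: key on the (segment, bus) pair."""
    mapping = {}
    for addr, physid in devices:
        mapping[parse_pci_address(addr)] = physid
    return mapping


# Addresses from the listing above; the physical ids are illustrative only.
DEVICES = [
    ("0000:ff:10.5", 0),
    ("0001:7f:10.5", 1),
    ("0001:bf:10.5", 2),
    ("0001:ff:10.5", 3),
]

if __name__ == "__main__":
    print(len(build_bus_only_map(DEVICES)), "entries when keyed by bus only")
    print(len(build_segment_aware_map(DEVICES)), "entries when keyed by (segment, bus)")
```

Keyed by bus only, 0000:ff and 0001:ff land on the same entry; keyed by the pair, all four devices keep distinct entries.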
      
      Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: hpa@zytor.com
      Link: http://lkml.kernel.org/r/1443096621-4119-1-git-send-email-izumi.taku@jp.fujitsu.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Add Intel cstate PMUs support · 7ce1346a
      Kan Liang authored
      
      
      This patch adds new PMUs to support cstate related free running
      (read-only) counters. These counters may be used simultaneously by other
      tools, such as turbostat. However, it still makes sense to implement them
      in perf, because we can conveniently collect them together with other
      events, and tools can use them without special MSR access code.
      
      These counters include CORE_C*_RESIDENCY and PKG_C*_RESIDENCY.
      According to counters' scope and category, two PMUs are registered with
      the perf_event core subsystem.
      
       - 'cstate_core': The counter is available for each physical core. The
                        counters include CORE_C*_RESIDENCY.
      
       - 'cstate_pkg':  The counter is available for each physical package. The
                        counters include PKG_C*_RESIDENCY.
      
      The events are exposed in sysfs for use by perf stat and other tools.
      The files are:
      
        /sys/devices/cstate_core/events/c*-residency
        /sys/devices/cstate_pkg/events/c*-residency
      
      These events only support system-wide mode counting.
      The /sys/devices/cstate_*/cpumask file can be used by tools to figure
      out which CPUs to monitor by default.
      
      The PMU type (attr->type) is dynamically allocated and is available from
      /sys/devices/core_misc/type and /sys/devices/cstate_*/type.
      
      Sampling is not supported.
      
      Here is an example.
      
       - To calculate the fraction of time when the core is running in C6 state
         CORE_C6_time% = CORE_C6_RESIDENCY / TSC
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5
      
         11838820015,,cstate_core/c6-residency/,5175919658,100.00
         11877130740,,msr/tsc/,5175922010,100.00
      
       For sleep, 99.7% of time we ran in C6 state.
      
       # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 busyloop
      
         1253316,,cstate_core/c6-residency/,4360969154,100.00
         10012635248,,msr/tsc/,4360972366,100.00
      
       For busyloop, 0.01% of time we ran in C6 state.
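
The percentages quoted above follow directly from the formula CORE_C6_time% = CORE_C6_RESIDENCY / TSC. A minimal Python sketch of that arithmetic, fed with the two CSV samples shown above, is given below; the helper names are hypothetical, and the parser assumes only that the first field is the counter value and the third field is the event name, as in those samples.

```python
def parse_perf_stat_csv(text):
    """Map event name -> counter value from 'perf stat -x,' style lines."""
    counts = {}
    for line in text.splitlines():
        fields = line.strip().split(",")
        # Skip blank lines and anything that does not start with a count.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        counts[fields[2]] = int(fields[0])
    return counts


def c6_residency_percent(counts):
    """CORE_C6_time% = CORE_C6_RESIDENCY / TSC, expressed as a percentage."""
    return 100.0 * counts["cstate_core/c6-residency/"] / counts["msr/tsc/"]


SLEEP_SAMPLE = """\
11838820015,,cstate_core/c6-residency/,5175919658,100.00
11877130740,,msr/tsc/,5175922010,100.00
"""

BUSYLOOP_SAMPLE = """\
1253316,,cstate_core/c6-residency/,4360969154,100.00
10012635248,,msr/tsc/,4360972366,100.00
"""

if __name__ == "__main__":
    for label, sample in (("sleep", SLEEP_SAMPLE), ("busyloop", BUSYLOOP_SAMPLE)):
        percent = c6_residency_percent(parse_perf_stat_csv(sample))
        print(f"{label}: {percent:.2f}% of time in C6")
```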
      
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1443443404-8581-1-git-send-email-kan.liang@intel.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · f6702681
      Linus Torvalds authored
      Pull xen bug fixes from David Vrabel:
      
       - Fix VM save performance regression with x86 PV guests
      
       - Make kexec work in x86 PVHVM guests (if Xen has the soft-reset ABI)
      
       - Other minor fixes.
      
      * tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        x86/xen/p2m: hint at the last populated P2M entry
        x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map
        x86/xen: Support kexec/kdump in HVM guests by doing a soft reset
        xen/x86: Don't try to write syscall-related MSRs for PV guests
        xen: use correct type for HYPERVISOR_memory_op()
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3ec20e2e
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
       "Three bug fixes and an update to the default configuration"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/defconfig: set SCSI_DH=y
        s390/vtime: correct scaled cputime of partially idle CPUs
        s390/boot/decompression: disable floating point in decompressor
        s390/numa: use correct type for node_to_cpumask_map
    • Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 · 3c68319b
      Linus Torvalds authored
      Pull CIFS fixes from Steve French:
       "Two fixes for problems pointed out by automated tools.
      
        Thanks PaX/grsecurity team and Dan Carpenter (and the Smatch tool)"
      
      * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
        [CIFS] Update cifs version number
        [SMB3] Do not fall back to SMBWriteX in set_file_size error cases
        [SMB3] Missing null tcon check
    • x86/xen/p2m: hint at the last populated P2M entry · 98dd166e
      David Vrabel authored
      With commit 633d6f17 (x86/xen: prepare p2m list for memory hotplug) the
      P2M may be sized to accommodate a much larger amount of memory than the
      domain currently has.
      
      When saving a domain, the toolstack must scan all the P2M looking for
      populated pages.  This results in a performance regression due to the
      unnecessary scanning.
      
      Instead of reporting (via shared_info) the maximum possible size of
      the P2M, hint at the last PFN which might be populated.  This hint is
      increased as new leaves are added to the P2M (in the expectation that
      they will be used for populated entries).
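
The effect of the hint on the scan can be sketched with a toy model. This is a Python illustration under stated assumptions, not the Xen code: the `P2M` class, its `last_pfn_hint` attribute and the leaf size of 512 entries are all hypothetical choices made for the example.

```python
class P2M:
    """Toy physical-to-machine map with a 'last possibly populated PFN' hint."""

    def __init__(self, max_pfns, leaf_size=512):
        self.max_pfns = max_pfns      # maximum possible size of the P2M
        self.leaf_size = leaf_size    # entries per leaf (illustrative value)
        self.entries = {}             # pfn -> mfn for populated entries only
        self.last_pfn_hint = 0        # exclusive upper bound worth scanning

    def set_entry(self, pfn, mfn):
        if not 0 <= pfn < self.max_pfns:
            raise IndexError(pfn)
        self.entries[pfn] = mfn
        # Adding a leaf raises the hint to the end of that leaf, on the
        # expectation that the rest of the leaf will be populated too.
        leaf_end = (pfn // self.leaf_size + 1) * self.leaf_size
        self.last_pfn_hint = max(self.last_pfn_hint, min(leaf_end, self.max_pfns))

    def scan(self, limit):
        """Return (populated pfns, number of slots visited) below limit."""
        populated = [pfn for pfn in range(limit) if pfn in self.entries]
        return populated, limit


if __name__ == "__main__":
    p2m = P2M(max_pfns=1 << 16)
    p2m.set_entry(10, 0xABC)
    p2m.set_entry(600, 0xDEF)
    print("full scan visits", p2m.scan(p2m.max_pfns)[1], "slots")
    print("hinted scan visits", p2m.scan(p2m.last_pfn_hint)[1], "slots")
```

Both scans find the same populated entries; the hinted one simply stops at the end of the last leaf that has been touched.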
      
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Cc: <stable@vger.kernel.org> # 4.0+
    • Merge tag 'perf-core-for-mingo' of... · 1c748dc2
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Switch the default callchain output mode to 'graph,0.5,caller', to make it
          look like the default for other tools, reducing the learning curve for
          people used to 'caller' based viewing. (Arnaldo Carvalho de Melo)
      
        - Implement column based horizontal scrolling in the hists browser (top, report),
          making it possible to use the TUI for things like 'perf mem report' where
          there are many more columns than can fit in a terminal. (Arnaldo Carvalho de Melo)
      
        - Support sorting by symbol_iaddr with perf.data files produced by
          'perf mem record'. (Don Zickus)
      
        - Display DATA_SRC sample type bit, i.e. when running 'perf evlist -v' the
          "DATA_SRC" wasn't appearing when set, fix it to look like: (Jiri Olsa)
      
            cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|DATA_SRC
      
        - Introduce the 'P' event modifier, meaning 'max precision level, please', i.e.:
      
           $ perf record -e cycles:P usleep 1
      
          Is now similar to:
      
           $ perf record usleep 1
      
          Useful, for instance, when specifying multiple events. (Jiri Olsa)
      
        - Make 'perf -v' and 'perf -h' work. (Jiri Olsa)
      
        - Fail properly when pattern matching fails to find a tracepoint, i.e.
          '-e non:existent' was being correctly handled, with a proper error message
          about that not being a valid event, but '-e non:existent*' wasn't,
          fix it. (Jiri Olsa)
      
      Infrastructure changes:
      
        - Separate arch specific entries in 'perf test' and add an 'Intel CQM' one
          to be run on x86 only. (Matt Fleming)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf tools: Fail properly in case pattern matching fails to find tracepoint · 27bf90bf
      Jiri Olsa authored
      
      
      Currently we don't fail properly when pattern matching fails to find any
      tracepoint.
      
      Current behaviour:
      
        $ perf record -e 'sched:krava*' sleep 1
        WARNING: event parser found nothinginvalid or unsupported event: 'sched:krava*'
        Run 'perf list' for a list of valid events
      
        usage: perf record [<options>] [<command>]
           or: perf record [<options>] -- <command> [<options>]
      
      With this patch:
      
        $ perf record -e 'sched:krava*' sleep 1
        event syntax error: 'sched:krava*'
                             \___ unknown tracepoint
      
        Error:  File /sys/kernel/debug/tracing/events/sched/krava* not found.
        Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
      
        Run 'perf list' for a list of valid events
      
         usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
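
The behaviour being fixed, a glob that matches nothing should be an error rather than a silent empty result, can be sketched as below. This is a Python illustration using the standard `fnmatch` module, not the perf event parser; `match_tracepoints()`, `UnknownTracepoint` and the `AVAILABLE` list are hypothetical.

```python
from fnmatch import fnmatchcase


class UnknownTracepoint(Exception):
    """Raised when a tracepoint pattern matches no known event."""


def match_tracepoints(pattern, available):
    """Return the events matching a 'subsystem:name' glob pattern.

    An empty match is treated as an error, so 'sched:krava*' fails just as
    a literal 'non:existent' would.
    """
    matches = [name for name in available if fnmatchcase(name, pattern)]
    if not matches:
        raise UnknownTracepoint(
            f"event syntax error: '{pattern}': unknown tracepoint"
        )
    return matches


# Illustrative event list; a real tool would read it from tracefs.
AVAILABLE = [
    "sched:sched_switch",
    "sched:sched_wakeup",
    "syscalls:sys_enter_openat",
]

if __name__ == "__main__":
    print(match_tracepoints("sched:sched_*", AVAILABLE))
    try:
        match_tracepoints("sched:krava*", AVAILABLE)
    except UnknownTracepoint as exc:
        print(exc)
```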
      
      Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1444073477-3181-1-git-send-email-jolsa@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hists browser: Implement horizontal scrolling · c6c3c02d
      Arnaldo Carvalho de Melo authored
      Do it using the recently introduced ui_browser scrolling mode, setting
      ui_browser.columns to the number of sort columns and then, when
      rendering each line, skipping as many initial columns as the user
      pressed the right arrow.
      
      As the user presses the left arrow, the ui_browser code will remove the
      scrolling counter and the left scrolling takes place.
      
      The right arrow key was an alias for ENTER, so people used to pressing it
      may get a bit annoyed at first, sorry! Ditto for ESC and the left key.
      
      Callchains can be left as is or we can, when rendering the Symbol
      column, store its position on the screen and then use
      ui_browser__gotorc() to print the callchain from there, i.e. the callchain
      would move around with the symbol.
      
      Leaving it as is, i.e. at a fixed position, close to the left, saves
      precious screen real estate for it, so I'm inclined to leave it as is
      for now.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ccqq9sabgfge5dwbqjwh71ij@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ui browser: Optional horizontal scrolling key binding · faae6f69
      Arnaldo Carvalho de Melo authored
      If the classes derived from ui_browser want to do some sort of
      horizontal scrolling, they have just to set ui_browser->columns to
      the number of columns available.
      
      Those columns can be the number of characters on the screen, if what is
      desired is to scroll character by character, or the number of columns in
      a spreadsheet like table.
      
      This is what the hist_browser will do, skipping ui_browser->horiz_scroll
      columns when rendering each of its lines.
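
The column-skipping idea described above can be sketched in a few lines. This is a Python illustration, not the ui_browser C code: `render_row()` and `scroll()` are hypothetical helpers, `columns` and `horiz_scroll` only echo the field names mentioned in the commit message, and clamping the offset to the range 0..columns-1 is an assumption of the sketch.

```python
def scroll(horiz_scroll, columns, key):
    """Update the scroll offset for a left/right key press.

    The offset is clamped so that at least one column stays visible
    (an assumption of this sketch).
    """
    if key == "right" and horiz_scroll < columns - 1:
        return horiz_scroll + 1
    if key == "left" and horiz_scroll > 0:
        return horiz_scroll - 1
    return horiz_scroll


def render_row(cells, horiz_scroll, width):
    """Render one table row, skipping the first horiz_scroll columns."""
    visible = cells[horiz_scroll:]
    return " ".join(visible)[:width]


if __name__ == "__main__":
    row = ["12.50%", "perf", "libc.so", "memcpy"]
    offset = 0
    for key in ("right", "right", "left"):
        offset = scroll(offset, len(row), key)
        print(f"{key:>5}: {render_row(row, offset, 80)}")
```

Scrolling by table column rather than by character keeps each remaining column intact, which is what makes wide outputs usable in a narrow terminal.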
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-q6a22bpmpgcr1awgzrmd4jrs@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf callchain: Switch default to 'graph,0.5,caller' · def02db0
      Arnaldo Carvalho de Melo authored
      
      
      Which is the most common default found in other similar tools.
      
      Requested-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://www.youtube.com/watch?v=nXaxk27zwlk
      Link: http://lkml.kernel.org/n/tip-v8lq36aispvdwgxdmt9p9jd9@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Add Intel CQM test · 035827e9
      Matt Fleming authored
      
      
      Peter reports that it's possible to trigger a WARN_ON_ONCE() in the
      Intel CQM code by combining a hardware event and an Intel CQM
      (software) event into a group. Unfortunately, the perf tools are not
      able to create this bundle and we need to manually construct a test
      case.
      
      For posterity, record Peter's proof of concept test case in tools/perf
      so that it presents a model for how we can perform architecture
      specific tests, or "arch tests", in perf in the future.
      
      The particular issue triggered in the test case is that when the
      counter for the hardware event overflows and triggers a PMI we'll read
      both the hardware event and the software event counters.
      Unfortunately, for CQM that involves performing an IPI to read the CQM
      event counters on all sockets, which in NMI context triggers the
      WARN_ON_ONCE().
      
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
      Cc: Vikas Shivappa <vikas.shivappa@intel.com>
      Cc: Vince Weaver <vince@deater.net>
      Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.uk
      Link: http://lkml.kernel.org/n/tip-3p4ra0u8vzm7m289a1m799kf@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Move x86 tests into arch directory · d8b167f9
      Matt Fleming authored
      
      
      Move out the x86-specific tests into tools/perf/arch/x86/tests and
      define an 'arch_tests' array, which is the list of tests that only apply
      to the build architecture.
      
      We can also now begin to get rid of some of the #ifdef code that is
      present in the generic perf tests.
      
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vikas Shivappa <vikas.shivappa@intel.com>
      Cc: Vince Weaver <vince@deater.net>
      Link: http://lkml.kernel.org/n/tip-9s68h4ptg06ah0lgnjz55mqn@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>