  1. Dec 16, 2021
    • perf ftrace: Add 'latency' subcommand · 53be5028
      Namhyung Kim authored
      
      
      The 'perf ftrace latency' subcommand shows a histogram of a function's
      execution time.  Users must specify the function name with the -T option.

      It is implemented by running the function_graph tracer on the given
      function only and parsing the tracer output to extract the durations.
      
        $ sudo perf ftrace latency -a -T mutex_lock sleep 1
        #   DURATION     |      COUNT | GRAPH                          |
             0 - 1    us |       4596 | ########################       |
             1 - 2    us |       1680 | #########                      |
             2 - 4    us |       1106 | #####                          |
             4 - 8    us |        546 | ##                             |
             8 - 16   us |        562 | ###                            |
            16 - 32   us |          1 |                                |
            32 - 64   us |          0 |                                |
            64 - 128  us |          0 |                                |
           128 - 256  us |          0 |                                |
           256 - 512  us |          0 |                                |
           512 - 1024 us |          0 |                                |
             1 - 2    ms |          0 |                                |
             2 - 4    ms |          0 |                                |
             4 - 8    ms |          0 |                                |
             8 - 16   ms |          0 |                                |
            16 - 32   ms |          0 |                                |
            32 - 64   ms |          0 |                                |
            64 - 128  ms |          0 |                                |
           128 - 256  ms |          0 |                                |
           256 - 512  ms |          0 |                                |
           512 - 1024 ms |          0 |                                |
             1 - ...   s |          0 |                                |
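
      The rows of the histogram above double in width from one to the next,
      so a measured duration can be mapped to a row with a shift loop.  The
      following is a minimal sketch of that mapping, not the code from this
      patch: the helper name latency_bucket() and the NUM_BUCKETS constant are
      assumptions, and the sketch treats 1024 us as the start of the "1 - 2 ms"
      row, as the row labels suggest.

```c
#include <stdint.h>

/* 11 microsecond rows + 10 millisecond rows + one open-ended "1 - ... s" row. */
#define NUM_BUCKETS 22

/*
 * Hypothetical helper: map a duration in whole microseconds to a row index.
 * Row 0 is "0 - 1 us"; row i (i >= 1) covers [2^(i-1), 2^i) us; the last
 * row collects everything from 2^20 us (about one second) upwards.
 */
static int latency_bucket(uint64_t usec)
{
	int idx = 0;

	/* Count the significant bits of usec, capped at the last row. */
	while (usec > 0 && idx < NUM_BUCKETS - 1) {
		usec >>= 1;
		idx++;
	}
	return idx;
}
```

      With this layout a 3 us call lands in the "2 - 4 us" row (index 2) and
      a 1023 us call in the "512 - 1024 us" row (index 10).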
      
      Committer testing:
      
      Measure the latency of the __handle_mm_fault kernel function, system
      wide, for 1 second.  The usual 'perf ftrace' output, now the same as
      that of the 'perf ftrace trace' subcommand, is shown first, followed by
      the new 'perf ftrace latency' subcommand:
      
        # perf ftrace -T __handle_mm_fault -a sleep 1 | wc -l
        709
        # perf ftrace -T __handle_mm_fault -a sleep 1 | wc -l
        510
        # perf ftrace -T __handle_mm_fault -a sleep 1 | head -20
        # tracer: function
        #
        # entries-in-buffer/entries-written: 0/0   #P:32
        #
        #           TASK-PID     CPU#     TIMESTAMP  FUNCTION
        #              | |         |         |         |
               perf-exec-1685104 [007]  90638.894613: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894620: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894622: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894635: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894688: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894702: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894714: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894728: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894740: __handle_mm_fault <-handle_mm_fault
               perf-exec-1685104 [007]  90638.894751: __handle_mm_fault <-handle_mm_fault
                   sleep-1685104 [007]  90638.894962: __handle_mm_fault <-handle_mm_fault
                   sleep-1685104 [007]  90638.894977: __handle_mm_fault <-handle_mm_fault
                   sleep-1685104 [007]  90638.894983: __handle_mm_fault <-handle_mm_fault
                   sleep-1685104 [007]  90638.894995: __handle_mm_fault <-handle_mm_fault
        # perf ftrace latency -T __handle_mm_fault -a sleep 1
        #   DURATION     |      COUNT | GRAPH                                          |
             0 - 1    us |        125 | ######                                         |
             1 - 2    us |        249 | #############                                  |
             2 - 4    us |        455 | ########################                       |
             4 - 8    us |         37 | #                                              |
             8 - 16   us |          0 |                                                |
            16 - 32   us |          0 |                                                |
            32 - 64   us |          0 |                                                |
            64 - 128  us |          0 |                                                |
           128 - 256  us |          0 |                                                |
           256 - 512  us |          0 |                                                |
           512 - 1024 us |          0 |                                                |
             1 - 2    ms |          0 |                                                |
             2 - 4    ms |          0 |                                                |
             4 - 8    ms |          0 |                                                |
             8 - 16   ms |          0 |                                                |
            16 - 32   ms |          0 |                                                |
            32 - 64   ms |          0 |                                                |
            64 - 128  ms |          0 |                                                |
           128 - 256  ms |          0 |                                                |
           256 - 512  ms |          0 |                                                |
           512 - 1024 ms |          0 |                                                |
             1 - ...   s |          0 |                                                |
        #
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211215185154.360314-4-namhyung@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ftrace: Move out common code from __cmd_ftrace · a9b8ae8a
      Namhyung Kim authored
      
      
      The signal setup code and evlist__prepare_workload() can be used for
      other subcommands.  Let's move them out of the __cmd_ftrace().  Then
      it doesn't need to pass argc and argv.
      
      On the other hand, select_tracer() is specific to the 'trace'
      subcommand, so it is better to move it into __cmd_ftrace().
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211215185154.360314-3-namhyung@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ftrace: Add 'trace' subcommand · 416e15ad
      Namhyung Kim authored
      
      
      This is a preparation for adding more subcommands to ftrace.  The
      'trace' subcommand does the same thing as running with no subcommand.
      
      Committer testing:
      
      The previous mode, i.e. no subcommand, and the new 'perf ftrace trace'
      are equivalent:
      
        # perf ftrace -G check_preempt_curr sleep 0.00001
        # tracer: function_graph
        #
        # CPU  DURATION                  FUNCTION CALLS
        # |     |   |                     |   |   |   |
         25)               |  check_preempt_curr() {
         25)               |    resched_curr() {
         25)               |      native_smp_send_reschedule() {
         25)               |        default_send_IPI_single_phys() {
         25)   0.110 us    |          __default_send_IPI_dest_field();
         25)   0.490 us    |        }
         25)   0.640 us    |      }
         25)   0.850 us    |    }
         25)   2.060 us    |  }
        # perf ftrace trace -G check_preempt_curr sleep 0.00001
        # tracer: function_graph
        #
        # CPU  DURATION                  FUNCTION CALLS
        # |     |   |                     |   |   |   |
         10)               |  check_preempt_curr() {
         10)               |    resched_curr() {
         10)               |      native_smp_send_reschedule() {
         10)               |        default_send_IPI_single_phys() {
         10)   0.080 us    |          __default_send_IPI_dest_field();
         10)   0.460 us    |        }
         10)   0.610 us    |      }
         10)   0.830 us    |    }
         10)   2.020 us    |  }
        #
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211215185154.360314-2-namhyung@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arch: Support register names from all archs · 83869019
      German Gomez authored
      
      
      When reading a perf.data file with register values, there is a mismatch
      between the names and the values of the registers because the tool is
      built using only the register names from the local architecture.
      
      Reading a perf.data file that was recorded on ARM64 gives the following
      erroneous output on an X86 machine:
      
        # perf report -i perf_arm64.data -D
        [...]
        24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
        ... user regs: mask 0x1ffffffff ABI 64-bit
        .... AX    0x0000ffffd1515817
        .... BX    0x0000ffffd1515480
        .... CX    0x0000aaaadabf6c80
        .... DX    0x000000000000002e
        .... SI    0x0000000040100401
        .... DI    0x0040600200000080
        .... BP    0x0000ffffd1510e10
        .... SP    0x0000000000000000
        .... IP    0x00000000000000dd
        .... FLAGS 0x0000ffffd1510cd0
        .... CS    0x0000000000000000
        .... SS    0x0000000000000030
        .... DS    0x0000ffffa569a208
        .... ES    0x0000000000000000
        .... FS    0x0000000000000000
        .... GS    0x0000000000000000
        .... R8    0x0000aaaad3de9650
        .... R9    0x0000ffffa57397f0
        .... R10   0x0000000000000001
        .... R11   0x0000ffffa57fd000
        .... R12   0x0000ffffd1515817
        .... R13   0x0000ffffd1515480
        .... R14   0x0000aaaadabf6c80
        .... R15   0x0000000000000000
        .... unknown 0x0000000000000001
        .... unknown 0x0000000000000000
        .... unknown 0x0000000000000000
        .... unknown 0x0000000000000000
        .... unknown 0x0000000000000000
        .... unknown 0x0000ffffd1510d90
        .... unknown 0x0000ffffa5739b90
        .... unknown 0x0000ffffd1510d80
        .... XMM0  0x0000ffffa57392c8
         ... thread: perf-exec:43239
         ...... dso: [kernel.kallsyms]
      
      As can be seen, the register names correspond to X86 registers, even
      though the perf.data file was recorded on an ARM64 system. After this
      patch, the output of the command displays the correct register names:
      
        # perf report -i perf_arm64.data -D
        [...]
        24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
        ... user regs: mask 0x1ffffffff ABI 64-bit
        .... x0    0x0000ffffd1515817
        .... x1    0x0000ffffd1515480
        .... x2    0x0000aaaadabf6c80
        .... x3    0x000000000000002e
        .... x4    0x0000000040100401
        .... x5    0x0040600200000080
        .... x6    0x0000ffffd1510e10
        .... x7    0x0000000000000000
        .... x8    0x00000000000000dd
        .... x9    0x0000ffffd1510cd0
        .... x10   0x0000000000000000
        .... x11   0x0000000000000030
        .... x12   0x0000ffffa569a208
        .... x13   0x0000000000000000
        .... x14   0x0000000000000000
        .... x15   0x0000000000000000
        .... x16   0x0000aaaad3de9650
        .... x17   0x0000ffffa57397f0
        .... x18   0x0000000000000001
        .... x19   0x0000ffffa57fd000
        .... x20   0x0000ffffd1515817
        .... x21   0x0000ffffd1515480
        .... x22   0x0000aaaadabf6c80
        .... x23   0x0000000000000000
        .... x24   0x0000000000000001
        .... x25   0x0000000000000000
        .... x26   0x0000000000000000
        .... x27   0x0000000000000000
        .... x28   0x0000000000000000
        .... x29   0x0000ffffd1510d90
        .... lr    0x0000ffffa5739b90
        .... sp    0x0000ffffd1510d80
        .... pc    0x0000ffffa57392c8
         ... thread: perf-exec:43239
         ...... dso: [kernel.kallsyms]
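
      The idea behind the fix can be sketched as a name lookup keyed by the
      architecture the data was recorded on, instead of the architecture the
      tool was built for.  This is an illustration only: reg_name_for_arch(),
      the table names and the "x86"/"arm64" strings are assumptions, and the
      tables hold just the first four registers from the two dumps above.

```c
#include <stddef.h>
#include <string.h>

/* Only the first four registers of each architecture, taken from the dumps. */
static const char *const x86_reg_names[]   = { "AX", "BX", "CX", "DX" };
static const char *const arm64_reg_names[] = { "x0", "x1", "x2", "x3" };

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical lookup: pick the table from the recorded arch, not the host. */
static const char *reg_name_for_arch(const char *arch, size_t id)
{
	if (!strcmp(arch, "x86"))
		return id < ARRAY_LEN(x86_reg_names) ? x86_reg_names[id] : "unknown";
	if (!strcmp(arch, "arm64"))
		return id < ARRAY_LEN(arm64_reg_names) ? arm64_reg_names[id] : "unknown";
	return "unknown";
}
```

      Register index 0 then prints as "x0" for an ARM64 recording and as "AX"
      for an X86 one, regardless of where the report is generated.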
      
      Tester comments:
      
      Athira reports:
      
      "Looks good to me. Tested this patchset in powerpc by capturing regs in
      powerpc and doing perf report to read the data from x86."
      
      Reported-by: Alexandre Truong <alexandre.truong@arm.com>
      Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: German Gomez <german.gomez@arm.com>
      Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-csky@vger.kernel.org
      Cc: linux-riscv@lists.infradead.org
      Link: https://lore.kernel.org/r/20211207180653.1147374-4-german.gomez@arm.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm64: Rename perf_event_arm_regs for ARM64 registers · d3b58af9
      German Gomez authored
      
      
      The registers for ARM and ARM64 are enumerated using two enums that have
      the same name. In order to be able to import both headers, the name of
      one can be replaced using the C preprocessor like so:
      
        #define perf_event_arm_regs perf_event_arm64_regs
        #include <asm/perf_regs.h>
        #undef perf_event_arm_regs
      
      This patch updates all imports of ARM64's perf_regs.h in order to
      prevent the naming collision.
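
      The effect of the rename can be reproduced in a single file.  In the
      sketch below the two enum definitions are shortened stand-ins for the
      ARM64 and ARM headers (the real enums have many more entries), and the
      count_*() helpers are hypothetical, added only to show that both types
      are usable side by side.

```c
/* Stand-in for the ARM64 <asm/perf_regs.h>, included under the rename. */
#define perf_event_arm_regs perf_event_arm64_regs
enum perf_event_arm_regs {
	PERF_REG_ARM64_X0,
	PERF_REG_ARM64_X1,
	PERF_REG_ARM64_MAX,
};
#undef perf_event_arm_regs

/* Stand-in for the 32-bit ARM header, which keeps the original enum name. */
enum perf_event_arm_regs {
	PERF_REG_ARM_R0,
	PERF_REG_ARM_R1,
	PERF_REG_ARM_R2,
	PERF_REG_ARM_MAX,
};

/* Both enum types now coexist in one translation unit. */
static int count_arm64_regs(void)
{
	enum perf_event_arm64_regs max = PERF_REG_ARM64_MAX;

	return (int)max;
}

static int count_arm_regs(void)
{
	enum perf_event_arm_regs max = PERF_REG_ARM_MAX;

	return (int)max;
}
```

      Without the #define/#undef pair the second definition would be rejected
      as a redefinition of 'enum perf_event_arm_regs'.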
      
      Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: German Gomez <german.gomez@arm.com>
      Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-csky@vger.kernel.org
      Cc: linux-riscv@lists.infradead.org
      Link: https://lore.kernel.org/r/20211207180653.1147374-3-german.gomez@arm.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf namespaces: Add helper nsinfo__is_in_root_namespace() · 5d28a17c
      Leo Yan authored
      
      
      Refactor the code that gathers PID information by creating the function
      nsinfo__get_nspid(), which parses a process's 'status' node under '/proc'.

      Based on this refactoring, introduce a new helper,
      nsinfo__is_in_root_namespace(), which returns true when the caller runs in
      the root PID namespace.
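
      One way such a check could work is to compare the "Tgid:" value in the
      'status' text with the innermost value on the "NStgid:" line, which
      lists the thread group ID once per nested PID namespace.  The sketch
      below is an assumption about the approach, not the code from this
      patch: status_last_value() and status_in_root_pidns() are hypothetical
      names, and the input is a string rather than a file so the logic can be
      tried in isolation.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return the last integer on the line that starts with `key`, or -1 if none. */
static long status_last_value(const char *status, const char *key)
{
	size_t klen = strlen(key);
	const char *line = status;

	while (line && *line) {
		if (!strncmp(line, key, klen)) {
			const char *p = line + klen;
			long last = -1;

			for (;;) {
				char *end;
				long v;

				while (*p == ' ' || *p == '\t')
					p++;
				if (*p == '\n' || *p == '\0')
					break;		/* end of this line */
				v = strtol(p, &end, 10);
				if (end == p)
					break;		/* non-numeric text */
				last = v;
				p = end;
			}
			return last;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Hypothetical check: equal IDs mean no nested PID namespace is involved. */
static bool status_in_root_pidns(const char *status)
{
	long tgid = status_last_value(status, "Tgid:");
	long nstgid = status_last_value(status, "NStgid:");

	/* No NStgid line at all: assume the root namespace. */
	return nstgid < 0 || tgid == nstgid;
}
```

      A status text with "NStgid:\t4321\t7" describes a process that is PID 7
      inside a nested namespace, so the check returns false for it.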
      
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20211212134721.1721245-2-leo.yan@linaro.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • libperf tests: Fix a spelling mistake "Runnnig" -> "Running" · 017f7d1f
      Colin Ian King authored
      
      
      There is a spelling mistake in a __T_VERBOSE message. Fix it.
      
      Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-janitors@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20211212222122.478537-1-colin.i.king@gmail.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf-loader: Use IS_ERR_OR_NULL() to clean code and fix check · 8acf3793
      Miaoqian Lin authored
      
      
      Use IS_ERR_OR_NULL() to make the code cleaner.
      Also, if priv is NULL, it is improper to call PTR_ERR(priv).
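
      The reason is easiest to see with user-space copies of the helpers.
      The definitions below are simplified re-implementations written for
      this illustration, and check_priv() with its -EINVAL choice is a
      hypothetical caller, not the code touched by this patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Simplified re-implementations of the error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }

static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical caller: NULL gets its own error code instead of PTR_ERR(). */
static long check_priv(const void *priv)
{
	if (IS_ERR_OR_NULL(priv))
		return priv ? PTR_ERR(priv) : -EINVAL;
	return 0;
}
```

      With these definitions PTR_ERR(NULL) evaluates to 0, which a caller
      would read as success, so the NULL case needs an explicit error value.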
      
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: unlisted-recipients
      Link: http://lore.kernel.org/lkml/20211212135613.20000-1-linmq006@gmail.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Remove duplicate and incorrect aux size checks · 7cc9680c
      James Clark authored
      
      
      There are two checks.  The first is a size check when running without
      admin privileges, but this is already covered by the driver and reported
      in more detail here (builtin-record.c):
      
        pr_err("Permission error mapping pages.\n"
               "Consider increasing "
               "/proc/sys/kernel/perf_event_mlock_kb,\n"
               "or try again with a smaller value of -m/--mmap_pages.\n"
               "(current value: %u,%u)\n",
      
      This had the effect of artificially limiting the aux buffer size to a
      value smaller than what was allowed because perf_event_mlock_kb wasn't
      taken into account.
      
      The second is to check for a power of two, but this is covered here
      (evlist.c):
      
        pr_info("rounding mmap pages size to %s (%lu pages)\n",
                buf, pages);
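
      For reference, a power-of-two test and a round-up step can be written
      as below.  This is a stand-alone sketch: the two function names follow
      common kernel helper names but are re-implemented here, and the
      saturation at the largest representable power of two is a choice made
      for this example.

```c
#include <limits.h>
#include <stdbool.h>

/* True when exactly one bit is set. */
static bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Smallest power of two that is >= n.  Saturates at the largest power of
 * two that fits in unsigned long, so the loop always terminates.
 */
static unsigned long roundup_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p < n && p <= ULONG_MAX / 2)
		p <<= 1;
	return p;
}
```

      A request for 100 pages is not a power of two and would be rounded to
      128 pages, while a request for 128 pages is left unchanged.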
      
      Reviewed-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20211208115435.610101-1-james.clark@arm.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events: Rename arm64 arch std event files · 6732f10b
      Andrew Kilroy authored
      
      
      A previous commit adds pmu events into the files
      
        armv8-common-and-microarch.json
        armv8-recommended.json
      
      that are actually specified in an armv9 reference supplement, not armv8.
      As such, naming the files with the armv8 prefix seems artificial.
      
      This patch renames the files to reflect that these two files are for
      arch std events regardless of whether they are defined in armv8 or
      armv9.
      
      Reviewed-by: John Garry <john.garry@huawei.com>
      Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20211210123706.7490-3-andrew.kilroy@arm.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events: For the Arm Neoverse N2 · 3987d65f
      Andrew Kilroy authored
      Updates the common and microarch json file to add counters available in
      the Arm Neoverse N2 chip, which should also apply to other ArmV8 and
      ArmV9 CPUs.  These are specified in the ArmV8 architecture reference
      manual:
      
        https://developer.arm.com/documentation/ddi0487/gb/?lang=en
      
      Some of the counters added to armv8-common-and-microarch.json are
      specified in the ArmV9 architecture reference manual supplement
      (issue A.a):
      
        https://developer.arm.com/documentation/ddi0608/aa
      
      The additional ArmV9 counters are
      
        TRB_WRAP
        TRCEXTOUT0
        TRCEXTOUT1
        TRCEXTOUT2
        TRCEXTOUT3
        CTI_TRIGOUT4
        CTI_TRIGOUT5
        CTI_TRIGOUT6
        CTI_TRIGOUT7
      
      This patch also adds files in pmu-events/arch/arm64/arm/neoverse-n2 for
      perf list to output the counter names in categories.
      
      Counters on the Neoverse N2 are stated in its reference manual:
      
        https://developer.arm.com/documentation/102099/0000
      
      
      
      Reviewed-by: John Garry <john.garry@huawei.com>
      Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20211210123706.7490-2-andrew.kilroy@arm.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf dlfilter: Drop unused variable · 888569db
      Salvatore Bonaccorso authored
      
      
      Compiling tools/perf/dlfilters/dlfilter-test-api-v0.c results in:
      
      	checking for stdlib.h... dlfilters/dlfilter-test-api-v0.c: In function ‘filter_event’:
      	dlfilters/dlfilter-test-api-v0.c:311:29: warning: unused variable ‘d’ [-Wunused-variable]
      	  311 |         struct filter_data *d = data;
      	      |
      
      So remove the variable now.
      
      Reviewed-by: German Gomez <german.gomez@arm.com>
      Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20211123211821.132924-1-carnil@debian.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT · b0fde9c6
      Namhyung Kim authored
      
      
      Use total latency info in the SPE counter packet as sample weight so
      that we can see it in local_weight and (global) weight sort keys.
      
      Maybe we can use PERF_SAMPLE_WEIGHT_STRUCT to support ins_lat as well,
      but I'm not sure which latency it matches.  So just add the total
      latency first.
      
      Reviewed-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20211201220855.1260688-1-namhyung@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bench: Use unbuffered output when pipe/tee'ing to a file · f0a29c96
      Sohaib Mohamed authored
      
      
      The output of 'perf bench' gets buffered when I pipe it to a file or to
      tee, in such a way that I can see it only at the end.
      
      E.g.
      
        $ perf bench internals synthesize -t
        < output comes out fine after each test run >
      
        $ perf bench internals synthesize -t | tee file.txt
        < output comes out only at the end of all tests >
      
      This patch resolves this issue for 'bench' and 'test' subcommands.
      
      See also:
      
        $ perf bench mem all | tee file.txt
        $ perf bench sched all | tee file.txt
        $ perf bench internals all -t | tee file.txt
        $ perf bench internals all | tee file.txt
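
      In C, the standard way to turn off stdio buffering is setvbuf() with
      _IONBF.  The sketch below shows that call in isolation; whether the
      patch uses exactly this call is an assumption, and make_unbuffered() is
      a hypothetical wrapper name.

```c
#include <stdio.h>

/*
 * Hypothetical helper: switch a stream to unbuffered mode so each write
 * reaches the pipe immediately.  Returns 0 on success, as setvbuf() does.
 * It should be called before any other operation on the stream.
 */
static int make_unbuffered(FILE *stream)
{
	return setvbuf(stream, NULL, _IONBF, 0);
}
```

      By default stdout is fully buffered when it is not a terminal, which
      is why the output shows up late only when piped to tee or to a file.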
      
      Committer testing:
      
      It really gets staggered, i.e. outputs in bursts, when the buffer fills
      up and has to be drained to make up space for more output.
      
      Suggested-by: Riccardo Mancini <rickyman7@gmail.com>
      Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Fabian Hemmer <copy@copy.sh>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20211119061409.78004-1-sohaib.amhmd@gmail.com
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client · 2b14864a
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "An SGID directory handling fix (marked for stable), a metrics
        accounting fix and two fixups to appease static checkers"
      
      * tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client:
        ceph: fix up non-directory creation in SGID directories
        ceph: initialize pathlen variable in reconnect_caps_cb
        ceph: initialize i_size variable in ceph_sync_read
        ceph: fix duplicate increment of opened_inodes metric
    • Merge tag 's390-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · d9c1e640
      Linus Torvalds authored
      Pull s390 fixes from Heiko Carstens:
      
       - Add missing handling of R_390_PLT32DBL relocation type in
         arch_kexec_apply_relocations_add(). Clang and the upcoming gcc 11.3
         generate such relocation entries, which our relocation code silently
         ignores, and which finally will result in an endless loop within the
         purgatory code in case of kexec.
      
       - Add proper handling of errors and print error messages when applying
         relocations
      
       - Fix duplicate tracking of irq nesting level in entry code
      
       - Let recordmcount.pl also look for jgnop mnemonic. Starting with
         binutils 2.37 objdump emits a jgnop mnemonic instead of brcl, which
         breaks mcount location detection. This is only a problem if used with
         compilers older than gcc 9, since with gcc 9 and newer compilers
         recordmcount.pl is not used anymore.
      
       - Remove preempt_disable()/preempt_enable() pair in
         kprobe_ftrace_handler() which was done for all architectures except
         for s390.
      
       - Update defconfig
      
      * tag 's390-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        recordmcount.pl: look for jgnop instruction as well as bcrl on s390
        s390/entry: fix duplicate tracking of irq nesting level
        s390: enable switchdev support in defconfig
        s390/kexec: handle R_390_PLT32DBL rela in arch_kexec_apply_relocations_add()
        s390/ftrace: remove preempt_disable()/preempt_enable() pair
        s390/kexec_file: fix error handling when applying relocations
        s390/kexec_file: print some more error messages
      d9c1e640
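The two kexec fixes in the s390 pull above come down to a common pattern: treat the newly seen relocation type like its already-supported PC-relative sibling, and report unknown types instead of silently skipping them. The following is a minimal userspace sketch of that pattern, not the kernel code. The enum names, `apply_reloc()` and its parameters are hypothetical stand-ins, and the sketch assumes the branch target is within range of a 32-bit halfword offset.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative relocation kinds. These names are hypothetical stand-ins
 * for this sketch, not the ELF R_390_* constants. */
enum reloc_kind {
	RELOC_ABS64,     /* absolute 64-bit value */
	RELOC_PC32DBL,   /* PC-relative, 32 bit, counted in halfwords */
	RELOC_PLT32DBL,  /* same encoding when no PLT is involved */
	RELOC_UNKNOWN    /* anything this sketch does not handle */
};

/*
 * Apply one relocation at loc.
 *   val  = symbol value plus addend
 *   addr = run-time address of the patched location
 * Returns 0 on success and -1 for an unsupported kind, so the caller
 * can stop and print an error instead of continuing with a half-
 * relocated image.
 */
static int apply_reloc(enum reloc_kind kind, void *loc,
		       uint64_t val, uint64_t addr)
{
	switch (kind) {
	case RELOC_ABS64:
		memcpy(loc, &val, sizeof(val));
		return 0;
	case RELOC_PC32DBL:
	case RELOC_PLT32DBL: {
		/* Distance in halfwords; assumed to fit in 32 bits. */
		int32_t rel = (int32_t)((int64_t)(val - addr) / 2);

		memcpy(loc, &rel, sizeof(rel));
		return 0;
	}
	default:
		return -1;
	}
}
```

Returning an error from the default case is the design point: a relocation that is silently ignored leaves stale bytes in the image, which is how the endless loop described above could arise.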
    • Linus Torvalds's avatar
      Merge tag 'hyperv-fixes-signed-20211214' of... · 213d9d4c
      Linus Torvalds authored
      Merge tag 'hyperv-fixes-signed-20211214' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
      
      Pull hyperv fix from Wei Liu:
       "Build fix from Randy Dunlap"
      
      * tag 'hyperv-fixes-signed-20211214' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        hv: utils: add PTP_1588_CLOCK to Kconfig to fix build
      213d9d4c
  2. Dec 14, 2021
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 5472f14a
      Linus Torvalds authored
      Pull virtio fixes from Michael Tsirkin:
       "Misc virtio and vdpa bugfixes"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vdpa: Consider device id larger than 31
        virtio/vsock: fix the transport to work with VMADDR_CID_ANY
        virtio_ring: Fix querying of maximum DMA mapping size for virtio device
        virtio: always enter drivers/virtio/
        vduse: check that offset is within bounds in get_config()
        vdpa: check that offsets are within bounds
        vduse: fix memory corruption in vduse_dev_ioctl()
      5472f14a
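Several of the vdpa and vduse fixes listed above are bounds checks on a caller-supplied offset and length. A minimal sketch of an overflow-safe form of such a check follows. It is not the driver code: `config_range_ok()` and `config_read()` are hypothetical helpers, and a plain byte buffer stands in for the device config space.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: true when [offset, offset + len) lies inside a
 * config space of config_size bytes. The subtraction form avoids the
 * wrap-around that a naive "offset + len > config_size" test can hit
 * when offset is close to SIZE_MAX.
 */
static bool config_range_ok(size_t config_size, size_t offset, size_t len)
{
	return offset <= config_size && len <= config_size - offset;
}

/* Hypothetical reader: copy len bytes starting at offset, or fail. */
static int config_read(const uint8_t *config, size_t config_size,
		       size_t offset, void *buf, size_t len)
{
	if (!config_range_ok(config_size, offset, len))
		return -1;
	memcpy(buf, config + offset, len);
	return 0;
}
```

Checking `offset` against the size first guarantees that `config_size - offset` cannot underflow, so both comparisons stay within the range of `size_t`.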
    • Sergio Paracuellos's avatar
      PCI: mt7621: Convert driver into 'bool' · aa50faff
      Sergio Paracuellos authored
      The driver is not ready yet to be compiled as a module since it depends
      on some symbols not exported on MIPS.  We have the following current
      problems:
      
        Building mips:allmodconfig ... failed
        --------------
        Error log:
        ERROR: modpost: missing MODULE_LICENSE() in drivers/pci/controller/pcie-mt7621.o
        ERROR: modpost: "mips_cm_unlock_other" [drivers/pci/controller/pcie-mt7621.ko] undefined!
        ERROR: modpost: "mips_cpc_base" [drivers/pci/controller/pcie-mt7621.ko] undefined!
        ERROR: modpost: "mips_cm_lock_other" [drivers/pci/controller/pcie-mt7621.ko] undefined!
        ERROR: modpost: "mips_cm_is64" [drivers/pci/controller/pcie-mt7621.ko] undefined!
        ERROR: modpost: "mips_gcr_base" [drivers/pci/controller/pcie-mt7621.ko] undefined!
      
      Temporarily move from 'tristate' to 'bool' until a better solution is
      ready.
      
      Also RALINK is redundant because SOC_MT7621 already depends on it.
      Hence, simplify condition.
      
      Fixes: 2bdd5238 ("PCI: mt7621: Add MediaTek MT7621 PCIe host controller driver")
      Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
      Reviewed-and-Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aa50faff
    • Linus Torvalds's avatar
      fget: clarify and improve __fget_files() implementation · e386dfc5
      Linus Torvalds authored
      Commit 054aa8d4 ("fget: check that the fd still exists after getting
      a ref to it") fixed a race with getting a reference to a file just as it
      was being closed.  It was a fairly minimal patch, and I didn't think
      re-checking the file pointer lookup would be a measurable overhead,
      since it was all right there and cached.
      
      But I was wrong, as pointed out by the kernel test robot.
      
      The 'poll2' case of the will-it-scale.per_thread_ops benchmark regressed
      quite noticeably.  Admittedly it seems to be a very artificial test:
      doing "poll()" system calls on regular files in a very tight loop in
      multiple threads.
      
      That means that basically all the time is spent just looking up file
      descriptors without ever doing anything useful with them (not that doing
      'poll()' on a regular file is useful to begin with).  And as a result it
      shows the extra "re-check fd" cost as a sore thumb.
      
      Happily, the regression is fixable by just writing the code to look up
      the fd to be better and clearer.  There's still a cost to verify the
      file pointer, but now it's basically in the noise even for that
      benchmark that does nothing else - and the code is more understandable
      and has better comments too.
      
      [ Side note: this patch is also a classic case of one that looks very
        messy with the default greedy Myers diff - it's much more legible with
        either the patience or histogram diff algorithm ]
      
      Link: https://lore.kernel.org/lkml/20211210053743.GA36420@xsang-OptiPlex-9020/
      Link: https://lore.kernel.org/lkml/20211213083154.GA20853@linux.intel.com/
      
      
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Tested-by: Carel Si <beibei.si@intel.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e386dfc5
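The fget commit above describes a lookup that takes a reference and then re-checks that the descriptor still points at the same file. A minimal userspace sketch of that "look up, take a reference, re-check the slot" pattern follows, using C11 atomics. It is not the kernel's `__fget_files()`: `struct obj`, `table`, `lookup_get()` and the other names are hypothetical, and the sketch assumes objects are never freed while a lookup may still be running (the kernel relies on RCU for that guarantee).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 4	/* hypothetical tiny descriptor table */

struct obj {
	atomic_long refcount;	/* 0 means the object is being torn down */
};

static _Atomic(struct obj *) table[TABLE_SIZE];

/* Take a reference only if the count is still non-zero. */
static bool get_ref_unless_zero(struct obj *o)
{
	long old = atomic_load(&o->refcount);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
		/* A failed exchange reloaded old; the loop re-tests it. */
	}
	return false;
}

static void put_ref(struct obj *o)
{
	atomic_fetch_sub(&o->refcount, 1);
}

/* Return a referenced object for fd, or NULL if the slot is empty. */
static struct obj *lookup_get(unsigned int fd)
{
	if (fd >= TABLE_SIZE)
		return NULL;

	for (;;) {
		struct obj *o = atomic_load(&table[fd]);

		if (!o)
			return NULL;
		if (!get_ref_unless_zero(o))
			continue;	/* object is dying; re-read the slot */
		/* Re-check: the slot may have been closed or replaced
		 * between the first load and taking the reference. */
		if (atomic_load(&table[fd]) == o)
			return o;
		put_ref(o);
	}
}
```

The second load of the slot is the "re-check fd" cost the commit message discusses: it is a single extra read of memory that was just accessed, which is why it can sit in the noise once the surrounding lookup is written tightly.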
  3. Dec 13, 2021
  4. Dec 12, 2021