Skip to content
  1. Dec 14, 2021
    • Waiman Long's avatar
      cgroup/cpuset: Don't let child cpusets restrict parent in default hierarchy · 1f1562fc
      Waiman Long authored
      
      
      In validate_change(), there is a check since v2.6.12 to make sure that
      each of the child cpusets must be a subset of a parent cpuset.  IOW, it
      allows child cpusets to restrict what changes can be made to a parent's
      "cpuset.cpus". This actually violates one of the core principles of the
      default hierarchy where a cgroup higher up in the hierarchy should be
      able to change configuration however it sees fit as deligation breaks
      down otherwise.
      
      To address this issue, the check is now removed for the default hierarchy
      to free parent cpusets from being restricted by child cpusets. The
      check will still apply for legacy hierarchy.
      
      Suggested-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      1f1562fc
  2. Dec 02, 2021
  3. Nov 30, 2021
  4. Nov 16, 2021
  5. Nov 15, 2021
  6. Nov 14, 2021
    • Thomas Gleixner's avatar
      Merge tag 'irqchip-fixes-5.16-1' of... · 979292af
      Thomas Gleixner authored
      Merge tag 'irqchip-fixes-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
      
      Pull irqchip fixes from Marc Zyngier:
      
        - Address an issue with the SiFive PLIC being unable to EOI
          a masked interrupt
      
        - Move the disable/enable methods in the CSky mpintc to
          mask/unmask
      
        - Fix a regression in the OF irq code where an interrupt-controller
          property in the same node as an interrupt-map property would get
          ignored
      
      Link: https://lore.kernel.org/all/20211112173459.4015233-1-maz@kernel.org
      979292af
    • Linus Torvalds's avatar
      Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux · c8c10954
      Linus Torvalds authored
      Pull zstd update from Nick Terrell:
       "Update to zstd-1.4.10.
      
        Add myself as the maintainer of zstd and update the zstd version in
        the kernel, which is now 4 years out of date, to a much more recent
        zstd release. This includes bug fixes, much more extensive fuzzing,
        and performance improvements. And generates the kernel zstd
        automatically from upstream zstd, so it is easier to keep the zstd
        verison up to date, and we don't fall so far out of date again.
      
        This includes 5 commits that update the zstd library version:
      
         - Adds a new kernel-style wrapper around zstd.
      
           This wrapper API is functionally equivalent to the subset of the
           current zstd API that is currently used. The wrapper API changes to
           be kernel style so that the symbols don't collide with zstd's
           symbols. The update to zstd-1.4.10 maintains the same API and
           preserves the semantics, so that none of the callers need to be
           updated. All callers are updated in the commit, because there are
           zero functional changes.
      
         - Adds an indirection for `lib/decompress_unzstd.c` so it doesn't
           depend on the layout of `lib/zstd/` to include every source file.
           This allows the next patch to be automatically generated.
      
         - Imports the zstd-1.4.10 source code. This commit is automatically
           generated from upstream zstd (https://github.com/facebook/zstd).
      
         - Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.
      
         - Fixes a newly added build warning for clang.
      
        The discussion around this patchset has been pretty long, so I've
        included a FAQ-style summary of the history of the patchset, and why
        we are taking this approach.
      
        Why do we need to update?
        -------------------------
      
        The zstd version in the kernel is based off of zstd-1.3.1, which is
        was released August 20, 2017. Since then zstd has seen many bug fixes
        and performance improvements. And, importantly, upstream zstd is
        continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to
        older versions. So the only way to sanely get these fixes is to keep
        up to date with upstream zstd.
      
        There are no known security issues that affect the kernel, but we need
        to be able to update in case there are. And while there are no known
        security issues, there are relevant bug fixes. For example the problem
        with large kernel decompression has been fixed upstream for over 2
        years [1]
      
        Additionally the performance improvements for kernel use cases are
        significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:
      
         - BtrFS zstd compression at levels 1 and 3 is 5% faster
      
         - BtrFS zstd decompression+read is 15% faster
      
         - SquashFS zstd decompression+read is 15% faster
      
         - F2FS zstd compression+write at level 3 is 8% faster
      
         - F2FS zstd decompression+read is 20% faster
      
         - ZRAM decompression+read is 30% faster
      
         - Kernel zstd decompression is 35% faster
      
         - Initramfs zstd decompression+build is 5% faster
      
        On top of this, there are significant performance improvements coming
        down the line in the next zstd release, and the new automated update
        patch generation will allow us to pull them easily.
      
        How is the update patch generated?
        ----------------------------------
      
        The first two patches are preparation for updating the zstd version.
        Then the 3rd patch in the series imports upstream zstd into the
        kernel. This patch is automatically generated from upstream. A script
        makes the necessary changes and imports it into the kernel. The
        changes are:
      
         - Replace all libc dependencies with kernel replacements and rewrite
           includes.
      
         - Remove unncessary portability macros like: #if defined(_MSC_VER).
      
         - Use the kernel xxhash instead of bundling it.
      
        This automation gets tested every commit by upstream's continuous
        integration. When we cut a new zstd release, we will submit a patch to
        the kernel to update the zstd version in the kernel.
      
        The automated process makes it easy to keep the kernel version of zstd
        up to date. The current zstd in the kernel shares the guts of the
        code, but has a lot of API and minor changes to work in the kernel.
        This is because at the time upstream zstd was not ready to be used in
        the kernel envrionment as-is. But, since then upstream zstd has
        evolved to support being used in the kernel as-is.
      
        Why are we updating in one big patch?
        -------------------------------------
      
        The 3rd patch in the series is very large. This is because it is
        restructuring the code, so it both deletes the existing zstd, and
        re-adds the new structure. Future updates will be directly
        proportional to the changes in upstream zstd since the last import.
        They will admittidly be large, as zstd is an actively developed
        project, and has hundreds of commits between every release. However,
        there is no other great alternative.
      
        One option ruled out is to replay every upstream zstd commit. This is
        not feasible for several reasons:
      
         - There are over 3500 upstream commits since the zstd version in the
           kernel.
      
         - The automation to automatically generate the kernel update was only
           added recently, so older commits cannot easily be imported.
      
         - Not every upstream zstd commit builds.
      
         - Only zstd releases are "supported", and individual commits may have
           bugs that were fixed before a release.
      
        Another option to reduce the patch size would be to first reorganize
        to the new file structure, and then apply the patch. However, the
        current kernel zstd is formatted with clang-format to be more
        "kernel-like". But, the new method imports zstd as-is, without
        additional formatting, to allow for closer correlation with upstream,
        and easier debugging. So the patch wouldn't be any smaller.
      
        It also doesn't make sense to import upstream zstd commit by commit
        going forward. Upstream zstd doesn't support production use cases
        running of the development branch. We have a lot of post-commit
        fuzzing that catches many bugs, so indiviudal commits may be buggy,
        but fixed before a release. So going forward, I intend to import every
        (important) zstd release into the Kernel.
      
        So, while it isn't ideal, updating in one big patch is the only patch
        I see forward.
      
        Who is responsible for this code?
        ---------------------------------
      
        I am. This patchset adds me as the maintainer for zstd. Previously,
        there was no tree for zstd patches. Because of that, there were
        several patches that either got ignored, or took a long time to merge,
        since it wasn't clear which tree should pick them up. I'm officially
        stepping up as maintainer, and setting up my tree as the path through
        which zstd patches get merged. I'll make sure that patches to the
        kernel zstd get ported upstream, so they aren't erased when the next
        version update happens.
      
        How is this code tested?
        ------------------------
      
        I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS,
        Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and
        aarch64. I checked both performance and correctness.
      
        Also, thanks to many people in the community who have tested these
        patches locally.
      
        Lastly, this code will bake in linux-next before being merged into
        v5.16.
      
        Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
        ------------------------------------------------------------
      
        This patchset has been outstanding since 2020, and zstd-1.4.10 was the
        latest release when it was created. Since the update patch is
        automatically generated from upstream, I could generate it from
        zstd-1.5.0.
      
        However, there were some large stack usage regressions in zstd-1.5.0,
        and are only fixed in the latest development branch. And the latest
        development branch contains some new code that needs to bake in the
        fuzzer before I would feel comfortable releasing to the kernel.
      
        Once this patchset has been merged, and we've released zstd-1.5.1, we
        can update the kernel to zstd-1.5.1, and exercise the update process.
      
        You may notice that zstd-1.4.10 doesn't exist upstream. This release
        is an artifical release based off of zstd-1.4.9, with some fixes for
        the kernel backported from the development branch. I will tag the
        zstd-1.4.10 release after this patchset is merged, so the Linux Kernel
        is running a known version of zstd that can be debugged upstream.
      
        Why was a wrapper API added?
        ----------------------------
      
        The first versions of this patchset migrated the kernel to the
        upstream zstd API. It first added a shim API that supported the new
        upstream API with the old code, then updated callers to use the new
        shim API, then transitioned to the new code and deleted the shim API.
        However, Cristoph Hellwig suggested that we transition to a kernel
        style API, and hide zstd's upstream API behind that. This is because
        zstd's upstream API is supports many other use cases, and does not
        follow the kernel style guide, while the kernel API is focused on the
        kernel's use cases, and follows the kernel style guide.
      
        Where is the previous discussion?
        ---------------------------------
      
        Links for the discussions of the previous versions of the patch set
        below. The largest changes in the design of the patchset are driven by
        the discussions in v11, v5, and v1. Sorry for the mix of links, I
        couldn't find most of the the threads on lkml.org"
      
      Link: https://lkml.org/lkml/2020/9/29/27 [1]
      Link: https://www.spinics.net/lists/linux-crypto/msg58189.html [v12]
      Link: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/ [v11]
      Link: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/ [v10]
      Link: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/ [v9]
      Link: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/ [v8]
      Link: https://lkml.org/lkml/2020/12/3/1195 [v7]
      Link: https://lkml.org/lkml/2020/12/2/1245 [v6]
      Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v5]
      Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html [v4]
      Link: https://lkml.org/lkml/2020/9/23/1074 [v3]
      Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html [v2]
      Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
      
       [v1]
      Signed-off-by: default avatarNick Terrell <terrelln@fb.com>
      Tested By: Paul Jones <paul@pauljones.id.au>
      Tested-by: default avatarOleksandr Natalenko <oleksandr@natalenko.name>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
      Tested-by: default avatarJean-Denis Girard <jd.girard@sysnux.pf>
      
      * tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux:
        lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
        MAINTAINERS: Add maintainer entry for zstd
        lib: zstd: Upgrade to latest upstream zstd version 1.4.10
        lib: zstd: Add decompress_sources.h for decompress_unzstd
        lib: zstd: Add kernel-specific API
      c8c10954
    • Linus Torvalds's avatar
      Merge tag 'virtio-mem-for-5.16' of git://github.com/davidhildenbrand/linux · ccfff0a2
      Linus Torvalds authored
      Pull virtio-mem update from David Hildenbrand:
       "Support the VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE feature in virtio-mem,
        now that "accidential" access to logically unplugged memory inside
        added Linux memory blocks is no longer possible, because we:
      
         - Removed /dev/kmem in commit bbcd53c9 ("drivers/char: remove
           /dev/kmem for good")
      
         - Disallowed access to virtio-mem device memory via /dev/mem in
           commit 2128f4e2 ("virtio-mem: disallow mapping virtio-mem memory
           via /dev/mem")
      
         - Sanitized access to virtio-mem device memory via /proc/kcore in
           commit 0daa322b ("fs/proc/kcore: don't read offline sections,
           logically offline pages and hwpoisoned pages")
      
         - Sanitized access to virtio-mem device memory via /proc/vmcore in
           commit ce281462 ("virtio-mem: kdump mode to sanitize
           /proc/vmcore access")
      
        The new VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE feature that will be
        required by some hypervisors implementing virtio-mem in the near
        future, so let's support it now that we safely can"
      
      * tag 'virtio-mem-for-5.16' of git://github.com/davidhildenbrand/linux:
        virtio-mem: support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
      ccfff0a2
    • James Clark's avatar
      perf tests: Remove bash constructs from stat_all_pmu.sh · ac96f463
      James Clark authored
      
      
      The tests were passing but without testing and were printing the
      following:
      
        $ ./perf test -v 90
        90: perf all PMU test                                               :
        --- start ---
        test child forked, pid 51650
        Testing cpu/branch-instructions/
        ./tests/shell/stat_all_pmu.sh: 10: [:
         Performance counter stats for 'true':
      
                   137,307      cpu/branch-instructions/
      
               0.001686672 seconds time elapsed
      
               0.001376000 seconds user
               0.000000000 seconds sys: unexpected operator
      
      Changing the regexes to a grep works in sh and prints this:
      
        $ ./perf test -v 90
        90: perf all PMU test                                               :
        --- start ---
        test child forked, pid 60186
        [...]
        Testing tlb_flush.stlb_any
        test child finished with 0
        ---- end ----
        perf all PMU test: Ok
      
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20211028134828.65774-4-james.clark@arm.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ac96f463
    • James Clark's avatar
      perf tests: Remove bash construct from record+zstd_comp_decomp.sh · a9cdc1c5
      James Clark authored
      
      
      Commit 463538a3 ("perf tests: Fix test 68 zstd compression for
      s390") inadvertently removed the -g flag from all platforms rather than
      just s390, because the [[ ]] construct fails in sh. Changing to single
      brackets restores testing of call graphs and removes the following error
      from the output:
      
        $ ./perf test -v 85
        85: Zstd perf.data compression/decompression                        :
        --- start ---
        test child forked, pid 50643
        Collecting compressed record file:
        ./tests/shell/record+zstd_comp_decomp.sh: 15: [[: not found
      
      Fixes: 463538a3 ("perf tests: Fix test 68 zstd compression for s390")
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20211028134828.65774-3-james.clark@arm.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a9cdc1c5
    • James Clark's avatar
      perf test: Remove bash construct from stat_bpf_counters.sh test · c8b94764
      James Clark authored
      
      
      Currently the test skips with an error because == only works in bash:
      
        $ ./perf test 91 -v
        Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
        91: perf stat --bpf-counters test                                   :
        --- start ---
        test child forked, pid 44586
        ./tests/shell/stat_bpf_counters.sh: 26: [: -v: unexpected operator
        test child finished with -2
        ---- end ----
        perf stat --bpf-counters test: Skip
      
      Changing == to = does the same thing, but doesn't result in an error:
      
        ./perf test 91 -v
        Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
        91: perf stat --bpf-counters test                                   :
        --- start ---
        test child forked, pid 45833
        Skipping: --bpf-counters not supported
          Error: unknown option `bpf-counters'
        [...]
        test child finished with -2
        ---- end ----
        perf stat --bpf-counters test: Skip
      
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20211028134828.65774-2-james.clark@arm.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c8b94764
    • Sohaib Mohamed's avatar
      perf bench futex: Fix memory leak of perf_cpu_map__new() · 88e48238
      Sohaib Mohamed authored
      
      
      ASan reports memory leaks while running:
      
        $ sudo ./perf bench futex all
      
      The leaks are caused by perf_cpu_map__new not being freed.
      This patch adds the missing perf_cpu_map__put since it calls
      cpu_map_delete implicitly.
      
      Fixes: 9c3516d1 ("libperf: Add perf_cpu_map__new()/perf_cpu_map__read() functions")
      Signed-off-by: default avatarSohaib Mohamed <sohaib.amhmd@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lore.kernel.org/lkml/20211112201134.77892-1-sohaib.amhmd@gmail.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      88e48238
    • Arnaldo Carvalho de Melo's avatar
      tools arch x86: Sync the msr-index.h copy with the kernel sources · 3442b5e0
      Arnaldo Carvalho de Melo authored
      To pick up the changes in:
      
        dae1bd58 ("x86/msr-index: Add MSRs for XFD")
      
      Addressing these tools/perf build warnings:
      
          diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
          Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
      
      That makes the beautification scripts to pick some new entries:
      
        $ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
        --- tools/arch/x86/include/asm/msr-index.h	2021-07-15 16:17:01.819817827 -0300
        +++ arch/x86/include/asm/msr-index.h	2021-11-06 15:49:33.738517311 -0300
        @@ -625,6 +625,8 @@
      
         #define MSR_IA32_BNDCFGS_RSVD		0x00000ffc
      
        +#define MSR_IA32_XFD			0x000001c4
        +#define MSR_IA32_XFD_ERR		0x000001c5
         #define MSR_IA32_XSS			0x00000da0
      
         #define MSR_IA32_APICBASE		0x0000001b
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/before
        $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/after
        $ diff -u /tmp/before /tmp/after
        --- /tmp/before	2021-11-13 11:10:39.964201505 -0300
        +++ /tmp/after	2021-11-13 11:10:47.902410873 -0300
        @@ -93,6 +93,8 @@
         	[0x000001b0] = "IA32_ENERGY_PERF_BIAS",
         	[0x000001b1] = "IA32_PACKAGE_THERM_STATUS",
         	[0x000001b2] = "IA32_PACKAGE_THERM_INTERRUPT",
        +	[0x000001c4] = "IA32_XFD",
        +	[0x000001c5] = "IA32_XFD_ERR",
         	[0x000001c8] = "LBR_SELECT",
         	[0x000001c9] = "LBR_TOS",
         	[0x000001d9] = "IA32_DEBUGCTLMSR",
        $
      
      And this gets rebuilt:
      
        CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
        INSTALL  trace_plugins
        LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
        LD       /tmp/build/perf/trace/beauty/perf-in.o
        LD       /tmp/build/perf/perf-in.o
        LINK     /tmp/build/perf/perf
      
      Now one can trace systemwide asking to see backtraces to where those
      MSRs are being read/written with:
      
        # perf trace -e msr:*_msr/max-stack=32/ --filter="msr==IA32_XFD || msr==IA32_XFD_ERR"
        ^C#
        #
      
      If we use -v (verbose mode) we can see what it does behind the scenes:
      
        # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_XFD || msr==IA32_XFD_ERR"
        <SNIP>
        New filter for msr:read_msr: (msr==0x1c4 || msr==0x1c5) && (common_pid != 4448951 && common_pid != 8781)
        New filter for msr:write_msr: (msr==0x1c4 || msr==0x1c5) && (common_pid != 4448951 && common_pid != 8781)
        <SNIP>
        ^C#
      
      Example with a frequent msr:
      
        # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
        Using CPUID AuthenticAMD-25-21-0
        0x48
        New filter for msr:read_msr: (msr==0x48) && (common_pid != 3738351 && common_pid != 3564)
        0x48
        New filter for msr:write_msr: (msr==0x48) && (common_pid != 3738351 && common_pid != 3564)
        mmap size 528384B
        Looking at the vmlinux_path (8 entries long)
        symsrc__init: build id mismatch for vmlinux.
        Using /proc/kcore for kernel data
        Using /proc/kallsyms for symbols
             0.000 pipewire/2479 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                               do_trace_write_msr ([kernel.kallsyms])
                                               do_trace_write_msr ([kernel.kallsyms])
                                               __switch_to_xtra ([kernel.kallsyms])
                                               __switch_to ([kernel.kallsyms])
                                               __schedule ([kernel.kallsyms])
                                               schedule ([kernel.kallsyms])
                                               schedule_hrtimeout_range_clock ([kernel.kallsyms])
                                               do_epoll_wait ([kernel.kallsyms])
                                               __x64_sys_epoll_wait ([kernel.kallsyms])
                                               do_syscall_64 ([kernel.kallsyms])
                                               entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                               epoll_wait (/usr/lib64/libc-2.33.so)
                                               [0x76c4] (/usr/lib64/spa-0.2/support/libspa-support.so)
                                               [0x4cf0] (/usr/lib64/spa-0.2/support/libspa-support.so)
             0.027 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                               do_trace_write_msr ([kernel.kallsyms])
                                               do_trace_write_msr ([kernel.kallsyms])
                                               __switch_to_xtra ([kernel.kallsyms])
                                               __switch_to ([kernel.kallsyms])
                                               __schedule ([kernel.kallsyms])
                                               schedule_idle ([kernel.kallsyms])
                                               do_idle ([kernel.kallsyms])
                                               cpu_startup_entry ([kernel.kallsyms])
                                               start_kernel ([kernel.kallsyms])
                                               secondary_startup_64_no_verify ([kernel.kallsyms])
        #
      
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chang S. Bae <chang.seok.bae@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/YY%2FJdb6on7swsn+C@kernel.org/
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3442b5e0
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync drm/i915_drm.h with the kernel sources · 06cf00c4
      Arnaldo Carvalho de Melo authored
      
      
      To pick up the changes in:
      
        e5e32171 ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface")
        9409eb35 ("drm/i915: Expose logical engine instance to user")
        ea673f17 ("drm/i915/uapi: Add comment clarifying purpose of I915_TILING_* values")
        d3ac8d42 ("drm/i915/pxp: interfaces for using protected objects")
        cbbd3764 ("drm/i915/pxp: Create the arbitrary session after boot")
      
      That don't add any new ioctl, so no changes in tooling.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
        diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
      
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Huang, Sean Z <sean.z.huang@intel.com>
      Cc: John Harrison <John.C.Harrison@Intel.com>
      Cc: Matthew Brost <matthew.brost@intel.com>
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      06cf00c4
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync sound/asound.h with the kernel sources · 37057e74
      Arnaldo Carvalho de Melo authored
      
      
      To pick up the changes in:
      
        5aec579e ("ALSA: uapi: Fix a C++ style comment in asound.h")
      
      That is just changing a // style comment to /* */.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
        diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h
      
      Cc: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      37057e74
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync linux/prctl.h with the kernel sources · 49024204
      Arnaldo Carvalho de Melo authored
      
      
      To pick the changes in:
      
        61bc346c ("uapi/linux/prctl: provide macro definitions for the PR_SCHED_CORE type argument")
      
      That don't result in any changes in tooling:
      
        $ tools/perf/trace/beauty/prctl_option.sh > before
        $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
        $ tools/perf/trace/beauty/prctl_option.sh > after
        $ diff -u before after
        $
      
      Just silences this perf tools build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
        diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h
      
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Eugene Syromiatnikov <esyr@redhat.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      49024204
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync arch prctl headers with the kernel sources · 5b749efe
      Arnaldo Carvalho de Melo authored
      To pick the changes in this cset:
      
        db8268df ("x86/arch_prctl: Add controls for dynamic XSTATE components")
      
      This picks these new prctls:
      
        $ tools/perf/trace/beauty/x86_arch_prctl.sh > /tmp/before
        $ cp arch/x86/include/uapi/asm/prctl.h tools/arch/x86/include/uapi/asm/prctl.h
        $ tools/perf/trace/beauty/x86_arch_prctl.sh > /tmp/after
        $ diff -u /tmp/before /tmp/after
        --- /tmp/before	2021-11-13 10:42:52.787308809 -0300
        +++ /tmp/after	2021-11-13 10:43:02.295558837 -0300
        @@ -6,6 +6,9 @@
         	[0x1004 - 0x1001]= "GET_GS",
         	[0x1011 - 0x1001]= "GET_CPUID",
         	[0x1012 - 0x1001]= "SET_CPUID",
        +	[0x1021 - 0x1001]= "GET_XCOMP_SUPP",
        +	[0x1022 - 0x1001]= "GET_XCOMP_PERM",
        +	[0x1023 - 0x1001]= "REQ_XCOMP_PERM",
         };
      
         #define x86_arch_prctl_codes_2_offset 0x2001
        $
      
      With this 'perf trace' can translate those numbers into strings and use
      the strings in filter expressions:
      
        # perf trace -e prctl
             0.000 ( 0.011 ms): DOM Worker/3722622 prctl(option: SET_NAME, arg2: 0x7f9c014b7df5)     = 0
             0.032 ( 0.002 ms): DOM Worker/3722622 prctl(option: SET_NAME, arg2: 0x7f9bb6b51580)     = 0
             5.452 ( 0.003 ms): StreamT~ns #30/3722623 prctl(option: SET_NAME, arg2: 0x7f9bdbdfeb70) = 0
             5.468 ( 0.002 ms): StreamT~ns #30/3722623 prctl(option: SET_NAME, arg2: 0x7f9bdbdfea70) = 0
            24.494 ( 0.009 ms): IndexedDB #556/3722624 prctl(option: SET_NAME, arg2: 0x7f562a32ae28) = 0
            24.540 ( 0.002 ms): IndexedDB #556/3722624 prctl(option: SET_NAME, arg2: 0x7f563c6d4b30) = 0
           670.281 ( 0.008 ms): systemd-userwo/3722339 prctl(option: SET_NAME, arg2: 0x564be30805c8) = 0
           670.293 ( 0.002 ms): systemd-userwo/3722339 prctl(option: SET_NAME, arg2: 0x564be30800f0) = 0
        ^C#
      
      This addresses these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/prctl.h' differs from latest version at 'arch/x86/include/uapi/asm/prctl.h'
        diff -u tools/arch/x86/include/uapi/asm/prctl.h arch/x86/include/uapi/asm/prctl.h
      
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chang S. Bae <chang.seok.bae@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/YY%2FER104k852WOTK@kernel.org/T/#u
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5b749efe
    • Jiri Olsa's avatar
      perf tools: Add more weak libbpf functions · 2a4898fc
      Jiri Olsa authored
      
      
      We hit the window where perf uses libbpf functions, that did not make it
      to the official libbpf release yet and it's breaking perf build with
      dynamicly linked libbpf.
      
      Fixing this by providing the new interface as weak functions which calls
      the original libbpf functions. Fortunatelly the changes were just
      renames.
      
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20211109140707.1689940-2-jolsa@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2a4898fc
    • Ian Rogers's avatar
      perf bpf: Avoid memory leak from perf_env__insert_btf() · 4924b1f7
      Ian Rogers authored
      
      
      perf_env__insert_btf() doesn't insert if a duplicate BTF id is
      encountered and this causes a memory leak. Modify the function to return
      a success/error value and then free the memory if insertion didn't
      happen.
      
      v2. Adds a return -1 when the insertion error occurs in
          perf_env__fetch_btf. This doesn't affect anything as the result is
          never checked.
      
      Fixes: 3792cb2f ("perf bpf: Save BTF in a rbtree in perf_env")
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20211112074525.121633-1-irogers@google.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4924b1f7
    • Ian Rogers's avatar
      perf symbols: Factor out annotation init/exit · 4f74f187
      Ian Rogers authored
      
      
      The exit function fixes a memory leak with the src field as detected by
      leak sanitizer. An example of which is:
      
      Indirect leak of 25133184 byte(s) in 207 object(s) allocated from:
          #0 0x7f199ecfe987 in __interceptor_calloc libsanitizer/asan/asan_malloc_linux.cpp:154
          #1 0x55defe638224 in annotated_source__alloc_histograms util/annotate.c:803
          #2 0x55defe6397e4 in symbol__hists util/annotate.c:952
          #3 0x55defe639908 in symbol__inc_addr_samples util/annotate.c:968
          #4 0x55defe63aa29 in hist_entry__inc_addr_samples util/annotate.c:1119
          #5 0x55defe499a79 in hist_iter__report_callback tools/perf/builtin-report.c:182
          #6 0x55defe7a859d in hist_entry_iter__add util/hist.c:1236
          #7 0x55defe49aa63 in process_sample_event tools/perf/builtin-report.c:315
          #8 0x55defe731bc8 in evlist__deliver_sample util/session.c:1473
          #9 0x55defe731e38 in machines__deliver_event util/session.c:1510
          #10 0x55defe732a23 in perf_session__deliver_event util/session.c:1590
          #11 0x55defe72951e in ordered_events__deliver_event util/session.c:183
          #12 0x55defe740082 in do_flush util/ordered-events.c:244
          #13 0x55defe7407cb in __ordered_events__flush util/ordered-events.c:323
          #14 0x55defe740a61 in ordered_events__flush util/ordered-events.c:341
          #15 0x55defe73837f in __perf_session__process_events util/session.c:2390
          #16 0x55defe7385ff in perf_session__process_events util/session.c:2420
          ...
      
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211112035124.94327-3-irogers@google.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4f74f187
    • Ian Rogers's avatar
      perf symbols: Bit pack to save a byte · 42704567
      Ian Rogers authored
      
      
      Use a bit field alongside the earlier bit fields.
      
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211112035124.94327-2-irogers@google.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      42704567
    • Ian Rogers's avatar
      perf symbols: Add documentation to 'struct symbol' · bd9acd9c
      Ian Rogers authored
      
      
      Refactor some existing comments and then infer the rest.
      
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211112035124.94327-1-irogers@google.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      bd9acd9c
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync files changed by new futex_waitv syscall · 7380aa89
      Arnaldo Carvalho de Melo authored
      
      
      To pick the changes in these csets:
      
        039c0ec9 ("futex,x86: Wire up sys_futex_waitv()")
        bf69bad3 ("futex: Implement sys_futex_waitv()")
      
      That add support for this new syscall in tools such as 'perf trace'.
      
      For instance, this is now possible:
      
        # perf trace -e futex_waitv
        ^C#
        # perf trace -v -e futex_waitv
        Using CPUID AuthenticAMD-25-21-0
        event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
        mmap size 528384B
        ^C#
        # perf trace -v -e futex* --max-events 10
        Using CPUID AuthenticAMD-25-21-0
        event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 202 || id == 449)
        mmap size 528384B
                 ? (         ): Timer/219310  ... [continued]: futex())                                            = -1 ETIMEDOUT (Connection timed out)
             0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
             0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
             0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
             0.088 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
             0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
             0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
             0.088 ( 0.089 ms): Timer/219310  ... [continued]: futex())                                            = 0
             0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
             0.181 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
        #
      
      That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
      tracepoints.
      
        $ grep futex_waitv tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
        449	common	futex_waitv		sys_futex_waitv
        $
      
      This addresses these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
        diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
        Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
      
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7380aa89