Skip to content
  1. Dec 23, 2021
    • Jens Axboe's avatar
      io_uring: zero iocb->ki_pos for stream file types · 7b9762a5
      Jens Axboe authored
      io_uring supports using offset == -1 for using the current file position,
      and we read that in as part of read/write command setup. For the non-iter
      read/write types we pass in NULL for the position pointer, but for the
      iter types we should not be passing any anything but 0 for the position
      for a stream.
      
      Clear kiocb->ki_pos if the file is a stream, don't leave it as -1. If we
      do, then the request will error with -ESPIPE.
      
      Fixes: ba04291e
      
       ("io_uring: allow use of offset == -1 to mean file position")
      Link: https://github.com/axboe/liburing/discussions/501
      Reported-by: default avatarSamuel Williams <samuel.williams@oriontransfer.co.nz>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      7b9762a5
  2. Dec 14, 2021
    • Jens Axboe's avatar
      io-wq: drop wqe lock before creating new worker · d800c65c
      Jens Axboe authored
      
      
      We have two io-wq creation paths:
      
      - On queue enqueue
      - When a worker goes to sleep
      
      The latter invokes worker creation with the wqe->lock held, but that can
      run into problems if we end up exiting and need to cancel the queued work.
      syzbot caught this:
      
      ============================================
      WARNING: possible recursive locking detected
      5.16.0-rc4-syzkaller #0 Not tainted
      --------------------------------------------
      iou-wrk-6468/6471 is trying to acquire lock:
      ffff88801aa98018 (&wqe->lock){+.+.}-{2:2}, at: io_worker_cancel_cb+0xb7/0x210 fs/io-wq.c:187
      
      but task is already holding lock:
      ffff88801aa98018 (&wqe->lock){+.+.}-{2:2}, at: io_wq_worker_sleeping+0xb6/0x140 fs/io-wq.c:700
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&wqe->lock);
        lock(&wqe->lock);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      1 lock held by iou-wrk-6468/6471:
       #0: ffff88801aa98018 (&wqe->lock){+.+.}-{2:2}, at: io_wq_worker_sleeping+0xb6/0x140 fs/io-wq.c:700
      
      stack backtrace:
      CPU: 1 PID: 6471 Comm: iou-wrk-6468 Not tainted 5.16.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x1dc/0x2d8 lib/dump_stack.c:106
       print_deadlock_bug kernel/locking/lockdep.c:2956 [inline]
       check_deadlock kernel/locking/lockdep.c:2999 [inline]
       validate_chain+0x5984/0x8240 kernel/locking/lockdep.c:3788
       __lock_acquire+0x1382/0x2b00 kernel/locking/lockdep.c:5027
       lock_acquire+0x19f/0x4d0 kernel/locking/lockdep.c:5637
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
       io_worker_cancel_cb+0xb7/0x210 fs/io-wq.c:187
       io_wq_cancel_tw_create fs/io-wq.c:1220 [inline]
       io_queue_worker_create+0x3cf/0x4c0 fs/io-wq.c:372
       io_wq_worker_sleeping+0xbe/0x140 fs/io-wq.c:701
       sched_submit_work kernel/sched/core.c:6295 [inline]
       schedule+0x67/0x1f0 kernel/sched/core.c:6323
       schedule_timeout+0xac/0x300 kernel/time/timer.c:1857
       wait_woken+0xca/0x1b0 kernel/sched/wait.c:460
       unix_msg_wait_data net/unix/unix_bpf.c:32 [inline]
       unix_bpf_recvmsg+0x7f9/0xe20 net/unix/unix_bpf.c:77
       unix_stream_recvmsg+0x214/0x2c0 net/unix/af_unix.c:2832
       sock_recvmsg_nosec net/socket.c:944 [inline]
       sock_recvmsg net/socket.c:962 [inline]
       sock_read_iter+0x3a7/0x4d0 net/socket.c:1035
       call_read_iter include/linux/fs.h:2156 [inline]
       io_iter_do_read fs/io_uring.c:3501 [inline]
       io_read fs/io_uring.c:3558 [inline]
       io_issue_sqe+0x144c/0x9590 fs/io_uring.c:6671
       io_wq_submit_work+0x2d8/0x790 fs/io_uring.c:6836
       io_worker_handle_work+0x808/0xdd0 fs/io-wq.c:574
       io_wqe_worker+0x395/0x870 fs/io-wq.c:630
       ret_from_fork+0x1f/0x30
      
      We can safely drop the lock before doing work creation, making the two
      contexts the same in that regard.
      
      Reported-by: default avatar <syzbot+b18b8be69df33a3918e9@syzkaller.appspotmail.com>
      Fixes: 71a85387
      
       ("io-wq: check for wq exit after adding new worker task_work")
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      d800c65c
  3. Dec 11, 2021
  4. Dec 07, 2021
  5. Dec 03, 2021
  6. Nov 27, 2021
    • Ye Bin's avatar
      io_uring: Fix undefined-behaviour in io_issue_sqe · f6223ff7
      Ye Bin authored
      
      
      We got issue as follows:
      ================================================================================
      UBSAN: Undefined behaviour in ./include/linux/ktime.h:42:14
      signed integer overflow:
      -4966321760114568020 * 1000000000 cannot be represented in type 'long long int'
      CPU: 1 PID: 2186 Comm: syz-executor.2 Not tainted 4.19.90+ #12
      Hardware name: linux,dummy-virt (DT)
      Call trace:
       dump_backtrace+0x0/0x3f0 arch/arm64/kernel/time.c:78
       show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x170/0x1dc lib/dump_stack.c:118
       ubsan_epilogue+0x18/0xb4 lib/ubsan.c:161
       handle_overflow+0x188/0x1dc lib/ubsan.c:192
       __ubsan_handle_mul_overflow+0x34/0x44 lib/ubsan.c:213
       ktime_set include/linux/ktime.h:42 [inline]
       timespec64_to_ktime include/linux/ktime.h:78 [inline]
       io_timeout fs/io_uring.c:5153 [inline]
       io_issue_sqe+0x42c8/0x4550 fs/io_uring.c:5599
       __io_queue_sqe+0x1b0/0xbc0 fs/io_uring.c:5988
       io_queue_sqe+0x1ac/0x248 fs/io_uring.c:6067
       io_submit_sqe fs/io_uring.c:6137 [inline]
       io_submit_sqes+0xed8/0x1c88 fs/io_uring.c:6331
       __do_sys_io_uring_enter fs/io_uring.c:8170 [inline]
       __se_sys_io_uring_enter fs/io_uring.c:8129 [inline]
       __arm64_sys_io_uring_enter+0x490/0x980 fs/io_uring.c:8129
       invoke_syscall arch/arm64/kernel/syscall.c:53 [inline]
       el0_svc_common+0x374/0x570 arch/arm64/kernel/syscall.c:121
       el0_svc_handler+0x190/0x260 arch/arm64/kernel/syscall.c:190
       el0_svc+0x10/0x218 arch/arm64/kernel/entry.S:1017
      ================================================================================
      
      As ktime_set only judge 'secs' if big than KTIME_SEC_MAX, but if we pass
      negative value maybe lead to overflow.
      To address this issue, we must check if 'sec' is negative.
      
      Signed-off-by: default avatarYe Bin <yebin10@huawei.com>
      Link: https://lore.kernel.org/r/20211118015907.844807-1-yebin10@huawei.com
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      f6223ff7
    • Ye Bin's avatar
      io_uring: fix soft lockup when call __io_remove_buffers · 1d0254e6
      Ye Bin authored
      I got issue as follows:
      [ 567.094140] __io_remove_buffers: [1]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
      [  594.360799] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
      [  594.364987] Modules linked in:
      [  594.365405] irq event stamp: 604180238
      [  594.365906] hardirqs last  enabled at (604180237): [<ffffffff93fec9bd>] _raw_spin_unlock_irqrestore+0x2d/0x50
      [  594.367181] hardirqs last disabled at (604180238): [<ffffffff93fbbadb>] sysvec_apic_timer_interrupt+0xb/0xc0
      [  594.368420] softirqs last  enabled at (569080666): [<ffffffff94200654>] __do_softirq+0x654/0xa9e
      [  594.369551] softirqs last disabled at (569080575): [<ffffffff913e1d6a>] irq_exit_rcu+0x1ca/0x250
      [  594.370692] CPU: 2 PID: 108 Comm: kworker/u32:5 Tainted: G            L    5.15.0-next-20211112+ #88
      [  594.371891] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
      [  594.373604] Workqueue: events_unbound io_ring_exit_work
      [  594.374303] RIP: 0010:_raw_spin_unlock_irqrestore+0x33/0x50
      [  594.375037] Code: 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 55 f5 55 fd 48 89 ef e8 ed a7 56 fd 80 e7 02 74 06 e8 43 13 7b fd fb bf 01 00 00 00 <e8> f8 78 474
      [  594.377433] RSP: 0018:ffff888101587a70 EFLAGS: 00000202
      [  594.378120] RAX: 0000000024030f0d RBX: 0000000000000246 RCX: 1ffffffff2f09106
      [  594.379053] RDX: 0000000000000000 RSI: ffffffff9449f0e0 RDI: 0000000000000001
      [  594.379991] RBP: ffffffff9586cdc0 R08: 0000000000000001 R09: fffffbfff2effcab
      [  594.380923] R10: ffffffff977fe557 R11: fffffbfff2effcaa R12: ffff8881b8f3def0
      [  594.381858] R13: 0000000000000246 R14: ffff888153a8b070 R15: 0000000000000000
      [  594.382787] FS:  0000000000000000(0000) GS:ffff888399c00000(0000) knlGS:0000000000000000
      [  594.383851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  594.384602] CR2: 00007fcbe71d2000 CR3: 00000000b4216000 CR4: 00000000000006e0
      [  594.385540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  594.386474] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  594.387403] Call Trace:
      [  594.387738]  <TASK>
      [  594.388042]  find_and_remove_object+0x118/0x160
      [  594.389321]  delete_object_full+0xc/0x20
      [  594.389852]  kfree+0x193/0x470
      [  594.390275]  __io_remove_buffers.part.0+0xed/0x147
      [  594.390931]  io_ring_ctx_free+0x342/0x6a2
      [  594.392159]  io_ring_exit_work+0x41e/0x486
      [  594.396419]  process_one_work+0x906/0x15a0
      [  594.399185]  worker_thread+0x8b/0xd80
      [  594.400259]  kthread+0x3bf/0x4a0
      [  594.401847]  ret_from_fork+0x22/0x30
      [  594.402343]  </TASK>
      
      Message from syslogd@localhost at Nov 13 09:09:54 ...
      kernel:watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
      [  596.793660] __io_remove_buffers: [2099199]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
      
      We can reproduce this issue by follow syzkaller log:
      r0 = syz_io_uring_setup(0x401, &(0x7f0000000300), &(0x7f0000003000/0x2000)=nil, &(0x7f0000ff8000/0x4000)=nil, &(0x7f0000000280)=<r1=>0x0, &(0x7f0000000380)=<r2=>0x0)
      sendmsg$ETHTOOL_MSG_FEATURES_SET(0xffffffffffffffff, &(0x7f0000003080)={0x0, 0x0, &(0x7f0000003040)={&(0x7f0000000040)=ANY=[], 0x18}}, 0x0)
      syz_io_uring_submit(r1, r2, &(0x7f0000000240)=@IORING_OP_PROVIDE_BUFFERS={0x1f, 0x5, 0x0, 0x401, 0x1, 0x0, 0x100, 0x0, 0x1, {0xfffd}}, 0x0)
      io_uring_enter(r0, 0x3a2d, 0x0, 0x0, 0x0, 0x0)
      
      The reason above issue  is 'buf->list' has 2,100,000 nodes, occupied cpu lead
      to soft lockup.
      To solve this issue, we need add schedule point when do while loop in
      '__io_remove_buffers'.
      After add  schedule point we do regression, get follow data.
      [  240.141864] __io_remove_buffers: [1]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
      [  268.408260] __io_remove_buffers: [1]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
      [  275.899234] __io_remove_buffers: [2099199]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
      [  296.741404] __io_remove_buffers: [1]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
      [  305.090059] __io_remove_buffers: [2099199]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
      [  325.415746] __io_remove_buffers: [1]start ctx=0xffff8881b92d1000 bgid=65533 buf=0xffff8881a17d8f00
      [  333.160318] __io_remove_buffers: [2099199]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
      ...
      
      Fixes:8bab4c09
      
      ("io_uring: allow conditional reschedule for intensive iterators")
      Signed-off-by: default avatarYe Bin <yebin10@huawei.com>
      Link: https://lore.kernel.org/r/20211122024737.2198530-1-yebin10@huawei.com
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      1d0254e6
  7. Nov 26, 2021
    • Pavel Begunkov's avatar
      io_uring: fix link traversal locking · 6af3f48b
      Pavel Begunkov authored
      WARNING: inconsistent lock state
      5.16.0-rc2-syzkaller #0 Not tainted
      inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
      ffff888078e11418 (&ctx->timeout_lock
      ){?.+.}-{2:2}
      , at: io_timeout_fn+0x6f/0x360 fs/io_uring.c:5943
      {HARDIRQ-ON-W} state was registered at:
        [...]
        spin_unlock_irq include/linux/spinlock.h:399 [inline]
        __io_poll_remove_one fs/io_uring.c:5669 [inline]
        __io_poll_remove_one fs/io_uring.c:5654 [inline]
        io_poll_remove_one+0x236/0x870 fs/io_uring.c:5680
        io_poll_remove_all+0x1af/0x235 fs/io_uring.c:5709
        io_ring_ctx_wait_and_kill+0x1cc/0x322 fs/io_uring.c:9534
        io_uring_release+0x42/0x46 fs/io_uring.c:9554
        __fput+0x286/0x9f0 fs/file_table.c:280
        task_work_run+0xdd/0x1a0 kernel/task_work.c:164
        exit_task_work include/linux/task_work.h:32 [inline]
        do_exit+0xc14/0x2b40 kernel/exit.c:832
      
      674ee8e1
      
       ("io_uring: correct link-list traversal locking") fixed a
      data race but introduced a possible deadlock and inconsistentcy in irq
      states. E.g.
      
      io_poll_remove_all()
          spin_lock_irq(timeout_lock)
          io_poll_remove_one()
              spin_lock/unlock_irq(poll_lock);
          spin_unlock_irq(timeout_lock)
      
      Another type of problem is freeing a request while holding
      ->timeout_lock, which may leads to a deadlock in
      io_commit_cqring() -> io_flush_timeouts() and other places.
      
      Having 3 nested locks is also too ugly. Add io_match_task_safe(), which
      would briefly take and release timeout_lock for race prevention inside,
      so the actuall request cancellation / free / etc. code doesn't have it
      taken.
      
      Reported-by: default avatar <syzbot+ff49a3059d49b0ca0eec@syzkaller.appspotmail.com>
      Reported-by: default avatar <syzbot+847f02ec20a6609a328b@syzkaller.appspotmail.com>
      Reported-by: default avatar <syzbot+3368aadcd30425ceb53b@syzkaller.appspotmail.com>
      Reported-by: default avatar <syzbot+51ce8887cdef77c9ac83@syzkaller.appspotmail.com>
      Reported-by: default avatar <syzbot+3cb756a49d2f394a9ee3@syzkaller.appspotmail.com>
      Fixes: 674ee8e1
      
       ("io_uring: correct link-list traversal locking")
      Cc: stable@kernel.org # 5.15+
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Link: https://lore.kernel.org/r/397f7ebf3f4171f1abe41f708ac1ecb5766f0b68.1637937097.git.asml.silence@gmail.com
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      6af3f48b
    • Pavel Begunkov's avatar
      io_uring: fail cancellation for EXITING tasks · 617a8948
      Pavel Begunkov authored
      WARNING: CPU: 1 PID: 20 at fs/io_uring.c:6269 io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269
      CPU: 1 PID: 20 Comm: kworker/1:0 Not tainted 5.16.0-rc1-syzkaller #0
      Workqueue: events io_fallback_req_func
      RIP: 0010:io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269
      Call Trace:
       <TASK>
       io_req_task_link_timeout+0x6b/0x1e0 fs/io_uring.c:6886
       io_fallback_req_func+0xf9/0x1ae fs/io_uring.c:1334
       process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
       worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
       kthread+0x405/0x4f0 kernel/kthread.c:327
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
       </TASK>
      
      We need original task's context to do cancellations, so if it's dying
      and the callback is executed in a fallback mode, fail the cancellation
      attempt.
      
      Fixes: 89b263f6
      
       ("io_uring: run linked timeouts from task_work")
      Cc: stable@kernel.org # 5.15+
      Reported-by: default avatar <syzbot+ab0cfe96c2b3cd1c1153@syzkaller.appspotmail.com>
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Link: https://lore.kernel.org/r/4c41c5f379c6941ad5a07cd48cb66ed62199cf7e.1637937097.git.asml.silence@gmail.com
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      617a8948
  8. Nov 23, 2021
  9. Nov 17, 2021
  10. Nov 15, 2021
    • Linus Torvalds's avatar
      Linux 5.16-rc1 · fa55b7dc
      Linus Torvalds authored
      v5.16-rc1
      fa55b7dc
    • Gustavo A. R. Silva's avatar
      kconfig: Add support for -Wimplicit-fallthrough · dee2b702
      Gustavo A. R. Silva authored
      
      
      Add Kconfig support for -Wimplicit-fallthrough for both GCC and Clang.
      
      The compiler option is under configuration CC_IMPLICIT_FALLTHROUGH,
      which is enabled by default.
      
      Special thanks to Nathan Chancellor who fixed the Clang bug[1][2]. This
      bugfix only appears in Clang 14.0.0, so older versions still contain
      the bug and -Wimplicit-fallthrough won't be enabled for them, for now.
      
      This concludes a long journey and now we are finally getting rid
      of the unintentional fallthrough bug-class in the kernel, entirely. :)
      
      Link: https://github.com/llvm/llvm-project/commit/9ed4a94d6451046a51ef393cd62f00710820a7e8 [1]
      Link: https://bugs.llvm.org/show_bug.cgi?id=51094 [2]
      Link: https://github.com/KSPP/linux/issues/115
      Link: https://github.com/ClangBuiltLinux/linux/issues/236
      Co-developed-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Co-developed-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: default avatarNathan Chancellor <nathan@kernel.org>
      Tested-by: default avatarNathan Chancellor <nathan@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dee2b702
    • Linus Torvalds's avatar
      Merge tag 'xfs-5.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · ce49bfc8
      Linus Torvalds authored
      Pull xfs cleanups from Darrick Wong:
       "The most 'exciting' aspect of this branch is that the xfsprogs
        maintainer and I have worked through the last of the code
        discrepancies between kernel and userspace libxfs such that there are
        no code differences between the two except for #includes.
      
        IOWs, diff suffices to demonstrate that the userspace tools behave the
        same as the kernel, and kernel-only bits are clearly marked in the
        /kernel/ source code instead of just the userspace source.
      
        Summary:
      
         - Clean up open-coded swap() calls.
      
         - A little bit of #ifdef golf to complete the reunification of the
           kernel and userspace libxfs source code"
      
      * tag 'xfs-5.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: sync xfs_btree_split macros with userspace libxfs
        xfs: #ifdef out perag code for userspace
        xfs: use swap() to make dabtree code cleaner
      ce49bfc8
    • Linus Torvalds's avatar
      Merge tag 'for-5.16/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · c3b68c27
      Linus Torvalds authored
      Pull more parisc fixes from Helge Deller:
       "Fix a build error in stracktrace.c, fix resolving of addresses to
        function names in backtraces, fix single-stepping in assembly code and
        flush userspace pte's when using set_pte_at()"
      
      * tag 'for-5.16/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc/entry: fix trace test in syscall exit path
        parisc: Flush kernel data mapping in set_pte_at() when installing pte for user page
        parisc: Fix implicit declaration of function '__kernel_text_address'
        parisc: Fix backtrace to always include init funtion names
      c3b68c27
    • Linus Torvalds's avatar
      Merge tag 'sh-for-5.16' of git://git.libc.org/linux-sh · 24318ae8
      Linus Torvalds authored
      Pull arch/sh updates from Rich Felker.
      
      * tag 'sh-for-5.16' of git://git.libc.org/linux-sh:
        sh: pgtable-3level: Fix cast to pointer from integer of different size
        sh: fix READ/WRITE redefinition warnings
        sh: define __BIG_ENDIAN for math-emu
        sh: math-emu: drop unused functions
        sh: fix kconfig unmet dependency warning for FRAME_POINTER
        sh: Cleanup about SPARSE_IRQ
        sh: kdump: add some attribute to function
        maple: fix wrong return value of maple_bus_init().
        sh: boot: avoid unneeded rebuilds under arch/sh/boot/compressed/
        sh: boot: add intermediate vmlinux.bin* to targets instead of extra-y
        sh: boards: Fix the cacography in irq.c
        sh: check return code of request_irq
        sh: fix trivial misannotations
      24318ae8
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 6ea45c57
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
      
       - Fix early_iounmap
      
       - Drop cc-option fallbacks for architecture selection
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 9156/1: drop cc-option fallbacks for architecture selection
        ARM: 9155/1: fix early early_iounmap()
      6ea45c57
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 0d1503d8
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Two fixes due to DT node name changes on Arm, Ltd. boards
      
       - Treewide rename of Ingenic CGU headers
      
       - Update ST email addresses
      
       - Remove Netlogic DT bindings
      
       - Dropping few more cases of redundant 'maxItems' in schemas
      
       - Convert toshiba,tc358767 bridge binding to schema
      
      * tag 'devicetree-fixes-for-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: watchdog: sunxi: fix error in schema
        bindings: media: venus: Drop redundant maxItems for power-domain-names
        dt-bindings: Remove Netlogic bindings
        clk: versatile: clk-icst: Ensure clock names are unique
        of: Support using 'mask' in making device bus id
        dt-bindings: treewide: Update @st.com email address to @foss.st.com
        dt-bindings: media: Update maintainers for st,stm32-hwspinlock.yaml
        dt-bindings: media: Update maintainers for st,stm32-cec.yaml
        dt-bindings: mfd: timers: Update maintainers for st,stm32-timers
        dt-bindings: timer: Update maintainers for st,stm32-timer
        dt-bindings: i2c: imx: hardware do not restrict clock-frequency to only 100 and 400 kHz
        dt-bindings: display: bridge: Convert toshiba,tc358767.txt to yaml
        dt-bindings: Rename Ingenic CGU headers to ingenic,*.h
      0d1503d8
    • Linus Torvalds's avatar
      Merge tag 'timers-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 622c72b6
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "A single fix for POSIX CPU timers to address a problem where POSIX CPU
        timer delivery stops working for a new child task because
        copy_process() copies state information which is only valid for the
        parent task"
      
      * tag 'timers-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
      622c72b6
    • Linus Torvalds's avatar
      Merge tag 'irq-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c36e33e2
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "A set of fixes for the interrupt subsystem
      
        Core code:
      
         - A regression fix for the Open Firmware interrupt mapping code where
           a interrupt controller property in a node caused a map property in
           the same node to be ignored.
      
        Interrupt chip drivers:
      
         - Workaround a limitation in SiFive PLIC interrupt chip which
           silently ignores an EOI when the interrupt line is masked.
      
         - Provide the missing mask/unmask implementation for the CSKY MP
           interrupt controller.
      
        PCI/MSI:
      
         - Prevent a use after free when PCI/MSI interrupts are released by
           destroying the sysfs entries before freeing the memory which is
           accessed in the sysfs show() function.
      
         - Implement a mask quirk for the Nvidia ION AHCI chip which does not
           advertise masking capability despite implementing it. Even worse
           the chip comes out of reset with all MSI entries masked, which due
           to the missing masking capability never get unmasked.
      
         - Move the check which prevents accessing the MSI[X] masking for XEN
           back into the low level accessors. The recent consolidation missed
           that these accessors can be invoked from places which do not have
           that check which broke XEN. Move them back to he original place
           instead of sprinkling tons of these checks all over the code"
      
      * tag 'irq-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        of/irq: Don't ignore interrupt-controller when interrupt-map failed
        irqchip/sifive-plic: Fixup EOI failed when masked
        irqchip/csky-mpintc: Fixup mask/unmask implementation
        PCI/MSI: Destroy sysfs before freeing entries
        PCI: Add MSI masking quirk for Nvidia ION AHCI
        PCI/MSI: Deal with devices lying about their MSI mask capability
        PCI/MSI: Move non-mask check back into low level accessors
      c36e33e2
    • Linus Torvalds's avatar
      Merge tag 'locking-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 218cc8b8
      Linus Torvalds authored
      Pull x86 static call update from Thomas Gleixner:
       "A single fix for static calls to make the trampoline patching more
        robust by placing explicit signature bytes after the call trampoline
        to prevent patching random other jumps like the CFI jump table
        entries"
      
      * tag 'locking-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        static_call,x86: Robustify trampoline patching
      218cc8b8
    • Linus Torvalds's avatar
      Merge tag 'sched_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fc661f2d
      Linus Torvalds authored
      Pull scheduler fixes from Borislav Petkov:
      
       - Avoid touching ~100 config files in order to be able to select the
         preemption model
      
       - clear cluster CPU masks too, on the CPU unplug path
      
       - prevent use-after-free in cfs
      
       - Prevent a race condition when updating CPU cache domains
      
       - Factor out common shared part of smp_prepare_cpus() into a common
         helper which can be called by both baremetal and Xen, in order to fix
         a booting of Xen PV guests
      
      * tag 'sched_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        preempt: Restore preemption model selection configs
        arch_topology: Fix missing clear cluster_cpumask in remove_cpu_topology()
        sched/fair: Prevent dead task groups from regaining cfs_rq's
        sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
        x86/smp: Factor out parts of native_smp_prepare_cpus()
      fc661f2d
    • Linus Torvalds's avatar
      Merge tag 'perf_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f7018be2
      Linus Torvalds authored
      Pull perf fixes from Borislav Petkov:
      
       - Prevent unintentional page sharing by checking whether a page
         reference to a PMU samples page has been acquired properly before
         that
      
       - Make sure the LBR_SELECT MSR is saved/restored too
      
       - Reset the LBR_SELECT MSR when resetting the LBR PMU to clear any
         residual data left
      
      * tag 'perf_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/core: Avoid put_page() when GUP fails
        perf/x86/vlbr: Add c->flags to vlbr event constraints
        perf/x86/lbr: Reset LBR_SELECT during vlbr reset
      f7018be2
    • Linus Torvalds's avatar
      Merge tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1654e95e
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
      
       - Add the model number of a new, Raptor Lake CPU, to intel-family.h
      
       - Do not log spurious corrected MCEs on SKL too, due to an erratum
      
       - Clarify the path of paravirt ops patches upstream
      
       - Add an optimization to avoid writing out AMX components to sigframes
         when former are in init state
      
      * tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Add Raptor Lake to Intel family
        x86/mce: Add errata workaround for Skylake SKX37
        MAINTAINERS: Add some information to PARAVIRT_OPS entry
        x86/fpu: Optimize out sigframe xfeatures when in init state
      1654e95e
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-for-v5.16-2021-11-13' of... · 35c8fad4
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v5.16-2021-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tools updates from Arnaldo Carvalho de Melo:
       "Hardware tracing:
      
         - ARM:
            * Print the size of the buffer size consistently in hexadecimal in
              ARM Coresight.
            * Add Coresight snapshot mode support.
            * Update --switch-events docs in 'perf record'.
            * Support hardware-based PID tracing.
            * Track task context switch for cpu-mode events.
      
         - Vendor events:
            * Add metric events JSON file for power10 platform
      
        perf test:
      
         - Get 'perf test' unit tests closer to kunit.
      
         - Topology tests improvements.
      
         - Remove bashisms from some tests.
      
        perf bench:
      
         - Fix memory leak of perf_cpu_map__new() in the futex benchmarks.
      
        libbpf:
      
         - Add some more weak libbpf functions o allow building with the
           libbpf versions, old ones, present in distros.
      
        libbeauty:
      
         - Translate [gs]setsockopt 'level' argument integer values to
           strings.
      
        tools headers UAPI:
      
         - Sync futex_waitv, arch prctl, sound, i195_drm and msr-index files
           with the kernel sources.
      
        Documentation:
      
         - Add documentation to 'struct symbol'.
      
         - Synchronize the definition of enum perf_hw_id with code in
           tools/perf/design.txt"
      
      * tag 'perf-tools-for-v5.16-2021-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits)
        perf tests: Remove bash constructs from stat_all_pmu.sh
        perf tests: Remove bash construct from record+zstd_comp_decomp.sh
        perf test: Remove bash construct from stat_bpf_counters.sh test
        perf bench futex: Fix memory leak of perf_cpu_map__new()
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
        tools headers UAPI: Sync sound/asound.h with the kernel sources
        tools headers UAPI: Sync linux/prctl.h with the kernel sources
        tools headers UAPI: Sync arch prctl headers with the kernel sources
        perf tools: Add more weak libbpf functions
        perf bpf: Avoid memory leak from perf_env__insert_btf()
        perf symbols: Factor out annotation init/exit
        perf symbols: Bit pack to save a byte
        perf symbols: Add documentation to 'struct symbol'
        tools headers UAPI: Sync files changed by new futex_waitv syscall
        perf test bpf: Use ARRAY_CHECK() instead of ad-hoc equivalent, addressing array_size.cocci warning
        perf arm-spe: Support hardware-based PID tracing
        perf arm-spe: Save context ID in record
        perf arm-spe: Update --switch-events docs in 'perf record'
        perf arm-spe: Track task context switch for cpu-mode events
        ...
      35c8fad4
  11. Nov 14, 2021
    • Thomas Gleixner's avatar
      Merge tag 'irqchip-fixes-5.16-1' of... · 979292af
      Thomas Gleixner authored
      Merge tag 'irqchip-fixes-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
      
      Pull irqchip fixes from Marc Zyngier:
      
        - Address an issue with the SiFive PLIC being unable to EOI
          a masked interrupt
      
        - Move the disable/enable methods in the CSky mpintc to
          mask/unmask
      
        - Fix a regression in the OF irq code where an interrupt-controller
          property in the same node as an interrupt-map property would get
          ignored
      
      Link: https://lore.kernel.org/all/20211112173459.4015233-1-maz@kernel.org
      979292af
    • Linus Torvalds's avatar
      Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux · c8c10954
      Linus Torvalds authored
      Pull zstd update from Nick Terrell:
       "Update to zstd-1.4.10.
      
        Add myself as the maintainer of zstd and update the zstd version in
        the kernel, which is now 4 years out of date, to a much more recent
        zstd release. This includes bug fixes, much more extensive fuzzing,
        and performance improvements. And generates the kernel zstd
        automatically from upstream zstd, so it is easier to keep the zstd
        verison up to date, and we don't fall so far out of date again.
      
        This includes 5 commits that update the zstd library version:
      
         - Adds a new kernel-style wrapper around zstd.
      
           This wrapper API is functionally equivalent to the subset of the
           current zstd API that is currently used. The wrapper API changes to
           be kernel style so that the symbols don't collide with zstd's
           symbols. The update to zstd-1.4.10 maintains the same API and
           preserves the semantics, so that none of the callers need to be
           updated. All callers are updated in the commit, because there are
           zero functional changes.
      
         - Adds an indirection for `lib/decompress_unzstd.c` so it doesn't
           depend on the layout of `lib/zstd/` to include every source file.
           This allows the next patch to be automatically generated.
      
         - Imports the zstd-1.4.10 source code. This commit is automatically
           generated from upstream zstd (https://github.com/facebook/zstd).
      
         - Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.
      
         - Fixes a newly added build warning for clang.
      
        The discussion around this patchset has been pretty long, so I've
        included a FAQ-style summary of the history of the patchset, and why
        we are taking this approach.
      
        Why do we need to update?
        -------------------------
      
        The zstd version in the kernel is based off of zstd-1.3.1, which is
        was released August 20, 2017. Since then zstd has seen many bug fixes
        and performance improvements. And, importantly, upstream zstd is
        continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to
        older versions. So the only way to sanely get these fixes is to keep
        up to date with upstream zstd.
      
        There are no known security issues that affect the kernel, but we need
        to be able to update in case there are. And while there are no known
        security issues, there are relevant bug fixes. For example the problem
        with large kernel decompression has been fixed upstream for over 2
        years [1]
      
        Additionally the performance improvements for kernel use cases are
        significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:
      
         - BtrFS zstd compression at levels 1 and 3 is 5% faster
      
         - BtrFS zstd decompression+read is 15% faster
      
         - SquashFS zstd decompression+read is 15% faster
      
         - F2FS zstd compression+write at level 3 is 8% faster
      
         - F2FS zstd decompression+read is 20% faster
      
         - ZRAM decompression+read is 30% faster
      
         - Kernel zstd decompression is 35% faster
      
         - Initramfs zstd decompression+build is 5% faster
      
        On top of this, there are significant performance improvements coming
        down the line in the next zstd release, and the new automated update
        patch generation will allow us to pull them easily.
      
        How is the update patch generated?
        ----------------------------------
      
        The first two patches are preparation for updating the zstd version.
        Then the 3rd patch in the series imports upstream zstd into the
        kernel. This patch is automatically generated from upstream. A script
        makes the necessary changes and imports it into the kernel. The
        changes are:
      
         - Replace all libc dependencies with kernel replacements and rewrite
           includes.
      
         - Remove unncessary portability macros like: #if defined(_MSC_VER).
      
         - Use the kernel xxhash instead of bundling it.
      
        This automation gets tested every commit by upstream's continuous
        integration. When we cut a new zstd release, we will submit a patch to
        the kernel to update the zstd version in the kernel.
      
        The automated process makes it easy to keep the kernel version of zstd
        up to date. The current zstd in the kernel shares the guts of the
        code, but has a lot of API and minor changes to work in the kernel.
        This is because at the time upstream zstd was not ready to be used in
        the kernel envrionment as-is. But, since then upstream zstd has
        evolved to support being used in the kernel as-is.
      
        Why are we updating in one big patch?
        -------------------------------------
      
        The 3rd patch in the series is very large. This is because it is
        restructuring the code, so it both deletes the existing zstd, and
        re-adds the new structure. Future updates will be directly
        proportional to the changes in upstream zstd since the last import.
        They will admittidly be large, as zstd is an actively developed
        project, and has hundreds of commits between every release. However,
        there is no other great alternative.
      
        One option ruled out is to replay every upstream zstd commit. This is
        not feasible for several reasons:
      
         - There are over 3500 upstream commits since the zstd version in the
           kernel.
      
         - The automation to automatically generate the kernel update was only
           added recently, so older commits cannot easily be imported.
      
         - Not every upstream zstd commit builds.
      
         - Only zstd releases are "supported", and individual commits may have
           bugs that were fixed before a release.
      
        Another option to reduce the patch size would be to first reorganize
        to the new file structure, and then apply the patch. However, the
        current kernel zstd is formatted with clang-format to be more
        "kernel-like". But, the new method imports zstd as-is, without
        additional formatting, to allow for closer correlation with upstream,
        and easier debugging. So the patch wouldn't be any smaller.
      
        It also doesn't make sense to import upstream zstd commit by commit
        going forward. Upstream zstd doesn't support production use cases
        running of the development branch. We have a lot of post-commit
        fuzzing that catches many bugs, so indiviudal commits may be buggy,
        but fixed before a release. So going forward, I intend to import every
        (important) zstd release into the Kernel.
      
        So, while it isn't ideal, updating in one big patch is the only patch
        I see forward.
      
        Who is responsible for this code?
        ---------------------------------
      
        I am. This patchset adds me as the maintainer for zstd. Previously,
        there was no tree for zstd patches. Because of that, there were
        several patches that either got ignored, or took a long time to merge,
        since it wasn't clear which tree should pick them up. I'm officially
        stepping up as maintainer, and setting up my tree as the path through
        which zstd patches get merged. I'll make sure that patches to the
        kernel zstd get ported upstream, so they aren't erased when the next
        version update happens.
      
        How is this code tested?
        ------------------------
      
        I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS,
        Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and
        aarch64. I checked both performance and correctness.
      
        Also, thanks to many people in the community who have tested these
        patches locally.
      
        Lastly, this code will bake in linux-next before being merged into
        v5.16.
      
        Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
        ------------------------------------------------------------
      
        This patchset has been outstanding since 2020, and zstd-1.4.10 was the
        latest release when it was created. Since the update patch is
        automatically generated from upstream, I could generate it from
        zstd-1.5.0.
      
        However, there were some large stack usage regressions in zstd-1.5.0,
        and are only fixed in the latest development branch. And the latest
        development branch contains some new code that needs to bake in the
        fuzzer before I would feel comfortable releasing to the kernel.
      
        Once this patchset has been merged, and we've released zstd-1.5.1, we
        can update the kernel to zstd-1.5.1, and exercise the update process.
      
        You may notice that zstd-1.4.10 doesn't exist upstream. This release
        is an artifical release based off of zstd-1.4.9, with some fixes for
        the kernel backported from the development branch. I will tag the
        zstd-1.4.10 release after this patchset is merged, so the Linux Kernel
        is running a known version of zstd that can be debugged upstream.
      
        Why was a wrapper API added?
        ----------------------------
      
        The first versions of this patchset migrated the kernel to the
        upstream zstd API. It first added a shim API that supported the new
        upstream API with the old code, then updated callers to use the new
        shim API, then transitioned to the new code and deleted the shim API.
        However, Cristoph Hellwig suggested that we transition to a kernel
        style API, and hide zstd's upstream API behind that. This is because
        zstd's upstream API is supports many other use cases, and does not
        follow the kernel style guide, while the kernel API is focused on the
        kernel's use cases, and follows the kernel style guide.
      
        Where is the previous discussion?
        ---------------------------------
      
        Links for the discussions of the previous versions of the patch set
        below. The largest changes in the design of the patchset are driven by
        the discussions in v11, v5, and v1. Sorry for the mix of links, I
        couldn't find most of the the threads on lkml.org"
      
      Link: https://lkml.org/lkml/2020/9/29/27 [1]
      Link: https://www.spinics.net/lists/linux-crypto/msg58189.html [v12]
      Link: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/ [v11]
      Link: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/ [v10]
      Link: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/ [v9]
      Link: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272a
      
      -1-nickrterrell@gmail.com/ [v8]
      Link: https://lkml.org/lkml/2020/12/3/1195 [v7]
      Link: https://lkml.org/lkml/2020/12/2/1245 [v6]
      Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v5]
      Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html [v4]
      Link: https://lkml.org/lkml/2020/9/23/1074 [v3]
      Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html [v2]
      Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v1]
      Signed-off-by: default avatarNick Terrell <terrelln@fb.com>
      Tested By: Paul Jones <paul@pauljones.id.au>
      Tested-by: default avatarOleksandr Natalenko <oleksandr@natalenko.name>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
      Tested-by: default avatarJean-Denis Girard <jd.girard@sysnux.pf>
      
      * tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux:
        lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
        MAINTAINERS: Add maintainer entry for zstd
        lib: zstd: Upgrade to latest upstream zstd version 1.4.10
        lib: zstd: Add decompress_sources.h for decompress_unzstd
        lib: zstd: Add kernel-specific API
      c8c10954
    • Linus Torvalds's avatar
      Merge tag 'virtio-mem-for-5.16' of git://github.com/davidhildenbrand/linux · ccfff0a2
      Linus Torvalds authored
      Pull virtio-mem update from David Hildenbrand:
       "Support the VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE feature in virtio-mem,
        now that "accidential" access to logically unplugged memory inside
        added Linux memory blocks is no longer possible, because we:
      
         - Removed /dev/kmem in commit bbcd53c9 ("drivers/char: remove
           /dev/kmem for good")
      
         - Disallowed access to virtio-mem device memory via /dev/mem in
           commit 2128f4e2 ("virtio-mem: disallow mapping virtio-mem memory
           via /dev/mem")
      
         - Sanitized access to virtio-mem device memory via /proc/kcore in
           commit 0daa322b ("fs/proc/kcore: don't read offline sections,
           logically offline pages and hwpoisoned pages")
      
         - Sanitized access to virtio-mem device memory via /proc/vmcore in
           commit ce281462 ("virtio-mem: kdump mode to sanitize
           /proc/vmcore access")
      
        The new VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE feature that will be
        required by some hypervisors implementing virtio-mem in the near
        future, so let's support it now that we safely can"
      
      * tag 'virtio-mem-for-5.16' of git://github.com/davidhildenbrand/linux:
        virtio-mem: support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
      ccfff0a2
    • James Clark's avatar
      perf tests: Remove bash constructs from stat_all_pmu.sh · ac96f463
      James Clark authored
      
      
      The tests were passing but without testing and were printing the
      following:
      
        $ ./perf test -v 90
        90: perf all PMU test                                               :
        --- start ---
        test child forked, pid 51650
        Testing cpu/branch-instructions/
        ./tests/shell/stat_all_pmu.sh: 10: [:
         Performance counter stats for 'true':
      
                   137,307      cpu/branch-instructions/
      
               0.001686672 seconds time elapsed
      
               0.001376000 seconds user
               0.000000000 seconds sys: unexpected operator
      
      Changing the regexes to a grep works in sh and prints this:
      
        $ ./perf test -v 90
        90: perf all PMU test                                               :
        --- start ---
        test child forked, pid 60186
        [...]
        Testing tlb_flush.stlb_any
        test child finished with 0
        ---- end ----
        perf all PMU test: Ok
      
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20211028134828.65774-4-james.clark@arm.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ac96f463
    • James Clark's avatar
      perf tests: Remove bash construct from record+zstd_comp_decomp.sh · a9cdc1c5
      James Clark authored
      Commit 463538a3 ("perf tests: Fix test 68 zstd compression for
      s390") inadvertently removed the -g flag from all platforms rather than
      just s390, because the [[ ]] construct fails in sh. Changing to single
      brackets restores testing of call graphs and removes the following error
      from the output:
      
        $ ./perf test -v 85
        85: Zstd perf.data compression/decompression                        :
        --- start ---
        test child forked, pid 50643
        Collecting compressed record file:
        ./tests/shell/record+zstd_comp_decomp.sh: 15: [[: not found
      
      Fixes: 463538a3
      
       ("perf tests: Fix test 68 zstd compression for s390")
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20211028134828.65774-3-james.clark@arm.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a9cdc1c5
    • James Clark's avatar
      perf test: Remove bash construct from stat_bpf_counters.sh test · c8b94764
      James Clark authored
      
      
      Currently the test skips with an error because == only works in bash:
      
        $ ./perf test 91 -v
        Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
        91: perf stat --bpf-counters test                                   :
        --- start ---
        test child forked, pid 44586
        ./tests/shell/stat_bpf_counters.sh: 26: [: -v: unexpected operator
        test child finished with -2
        ---- end ----
        perf stat --bpf-counters test: Skip
      
      Changing == to = does the same thing, but doesn't result in an error:
      
        ./perf test 91 -v
        Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
        91: perf stat --bpf-counters test                                   :
        --- start ---
        test child forked, pid 45833
        Skipping: --bpf-counters not supported
          Error: unknown option `bpf-counters'
        [...]
        test child finished with -2
        ---- end ----
        perf stat --bpf-counters test: Skip
      
      Signed-off-by: default avatarJames Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20211028134828.65774-2-james.clark@arm.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c8b94764
    • Sohaib Mohamed's avatar
      perf bench futex: Fix memory leak of perf_cpu_map__new() · 88e48238
      Sohaib Mohamed authored
      ASan reports memory leaks while running:
      
        $ sudo ./perf bench futex all
      
      The leaks are caused by perf_cpu_map__new not being freed.
      This patch adds the missing perf_cpu_map__put since it calls
      cpu_map_delete implicitly.
      
      Fixes: 9c3516d1
      
       ("libperf: Add perf_cpu_map__new()/perf_cpu_map__read() functions")
      Signed-off-by: default avatarSohaib Mohamed <sohaib.amhmd@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lore.kernel.org/lkml/20211112201134.77892-1-sohaib.amhmd@gmail.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      88e48238
    • Arnaldo Carvalho de Melo's avatar
      tools arch x86: Sync the msr-index.h copy with the kernel sources · 3442b5e0
      Arnaldo Carvalho de Melo authored
      To pick up the changes in:
      
        dae1bd58
      
       ("x86/msr-index: Add MSRs for XFD")
      
      Addressing these tools/perf build warnings:
      
          diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
          Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
      
      That makes the beautification scripts to pick some new entries:
      
        $ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
        --- tools/arch/x86/include/asm/msr-index.h	2021-07-15 16:17:01.819817827 -0300
        +++ arch/x86/include/asm/msr-index.h	2021-11-06 15:49:33.738517311 -0300
        @@ -625,6 +625,8 @@
      
         #define MSR_IA32_BNDCFGS_RSVD		0x00000ffc
      
        +#define MSR_IA32_XFD			0x000001c4
        +#define MSR_IA32_XFD_ERR		0x000001c5
         #define MSR_IA32_XSS			0x00000da0
      
         #define MSR_IA32_APICBASE		0x0000001b
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/before
        $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > /tmp/after
        $ diff -u /tmp/before /tmp/after
        --- /tmp/before	2021-11-13 11:10:39.964201505 -0300
        +++ /tmp/after	2021-11-13 11:10:47.902410873 -0300
        @@ -93,6 +93,8 @@
         	[0x000001b0] = "IA32_ENERGY_PERF_BIAS",
         	[0x000001b1] = "IA32_PACKAGE_THERM_STATUS",
         	[0x000001b2] = "IA32_PACKAGE_THERM_INTERRUPT",
        +	[0x000001c4] = "IA32_XFD",
        +	[0x000001c5] = "IA32_XFD_ERR",
         	[0x000001c8] = "LBR_SELECT",
         	[0x000001c9] = "LBR_TOS",
         	[0x000001d9] = "IA32_DEBUGCTLMSR",
        $
      
      And this gets rebuilt:
      
        CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
        INSTALL  trace_plugins
        LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
        LD       /tmp/build/perf/trace/beauty/perf-in.o
        LD       /tmp/build/perf/perf-in.o
        LINK     /tmp/build/perf/perf
      
      Now one can trace systemwide asking to see backtraces to where those
      MSRs are being read/written with:
      
        # perf trace -e msr:*_msr/max-stack=32/ --filter="msr==IA32_XFD || msr==IA32_XFD_ERR"
        ^C#
        #
      
      If we use -v (verbose mode) we can see what it does behind the scenes:
      
        # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_XFD || msr==IA32_XFD_ERR"
        <SNIP>
        New filter for msr:read_msr: (msr==0x1c4 || msr==0x1c5) && (common_pid != 4448951 && common_pid != 8781)
        New filter for msr:write_msr: (msr==0x1c4 || msr==0x1c5) && (common_pid != 4448951 && common_pid != 8781)
        <SNIP>
        ^C#
      
      Example with a frequent msr:
      
        # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
        Using CPUID AuthenticAMD-25-21-0
        0x48
        New filter for msr:read_msr: (msr==0x48) && (common_pid != 3738351 && common_pid != 3564)
        0x48
        New filter for msr:write_msr: (msr==0x48) && (common_pid != 3738351 && common_pid != 3564)
        mmap size 528384B
        Looking at the vmlinux_path (8 entries long)
        symsrc__init: build id mismatch for vmlinux.
        Using /proc/kcore for kernel data
        Using /proc/kallsyms for symbols
             0.000 pipewire/2479 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                               do_trace_write_msr ([kernel.kallsyms])
                                               do_trace_write_msr ([kernel.kallsyms])
                                               __switch_to_xtra ([kernel.kallsyms])
                                               __switch_to ([kernel.kallsyms])
                                               __schedule ([kernel.kallsyms])
                                               schedule ([kernel.kallsyms])
                                               schedule_hrtimeout_range_clock ([kernel.kallsyms])
                                               do_epoll_wait ([kernel.kallsyms])
                                               __x64_sys_epoll_wait ([kernel.kallsyms])
                                               do_syscall_64 ([kernel.kallsyms])
                                               entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                               epoll_wait (/usr/lib64/libc-2.33.so)
                                               [0x76c4] (/usr/lib64/spa-0.2/support/libspa-support.so)
                                               [0x4cf0] (/usr/lib64/spa-0.2/support/libspa-support.so)
             0.027 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                               do_trace_write_msr ([kernel.kallsyms])
                                               do_trace_write_msr ([kernel.kallsyms])
                                               __switch_to_xtra ([kernel.kallsyms])
                                               __switch_to ([kernel.kallsyms])
                                               __schedule ([kernel.kallsyms])
                                               schedule_idle ([kernel.kallsyms])
                                               do_idle ([kernel.kallsyms])
                                               cpu_startup_entry ([kernel.kallsyms])
                                               start_kernel ([kernel.kallsyms])
                                               secondary_startup_64_no_verify ([kernel.kallsyms])
        #
      
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chang S. Bae <chang.seok.bae@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/YY%2FJdb6on7swsn+C@kernel.org/
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3442b5e0
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync drm/i915_drm.h with the kernel sources · 06cf00c4
      Arnaldo Carvalho de Melo authored
      To pick up the changes in:
      
        e5e32171 ("drm/i915/guc: Connect UAPI to GuC multi-lrc interface")
        9409eb35 ("drm/i915: Expose logical engine instance to user")
        ea673f17 ("drm/i915/uapi: Add comment clarifying purpose of I915_TILING_* values")
        d3ac8d42 ("drm/i915/pxp: interfaces for using protected objects")
        cbbd3764
      
       ("drm/i915/pxp: Create the arbitrary session after boot")
      
      That don't add any new ioctl, so no changes in tooling.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
        diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
      
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Huang, Sean Z <sean.z.huang@intel.com>
      Cc: John Harrison <John.C.Harrison@Intel.com>
      Cc: Matthew Brost <matthew.brost@intel.com>
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      06cf00c4
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync sound/asound.h with the kernel sources · 37057e74
      Arnaldo Carvalho de Melo authored
      To pick up the changes in:
      
        5aec579e
      
       ("ALSA: uapi: Fix a C++ style comment in asound.h")
      
      That is just changing a // style comment to /* */.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
        diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h
      
      Cc: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      37057e74
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync linux/prctl.h with the kernel sources · 49024204
      Arnaldo Carvalho de Melo authored
      To pick the changes in:
      
        61bc346c
      
       ("uapi/linux/prctl: provide macro definitions for the PR_SCHED_CORE type argument")
      
      That don't result in any changes in tooling:
      
        $ tools/perf/trace/beauty/prctl_option.sh > before
        $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
        $ tools/perf/trace/beauty/prctl_option.sh > after
        $ diff -u before after
        $
      
      Just silences this perf tools build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
        diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h
      
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Eugene Syromiatnikov <esyr@redhat.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      49024204
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync arch prctl headers with the kernel sources · 5b749efe
      Arnaldo Carvalho de Melo authored
      To pick the changes in this cset:
      
        db8268df
      
       ("x86/arch_prctl: Add controls for dynamic XSTATE components")
      
      This picks these new prctls:
      
        $ tools/perf/trace/beauty/x86_arch_prctl.sh > /tmp/before
        $ cp arch/x86/include/uapi/asm/prctl.h tools/arch/x86/include/uapi/asm/prctl.h
        $ tools/perf/trace/beauty/x86_arch_prctl.sh > /tmp/after
        $ diff -u /tmp/before /tmp/after
        --- /tmp/before	2021-11-13 10:42:52.787308809 -0300
        +++ /tmp/after	2021-11-13 10:43:02.295558837 -0300
        @@ -6,6 +6,9 @@
         	[0x1004 - 0x1001]= "GET_GS",
         	[0x1011 - 0x1001]= "GET_CPUID",
         	[0x1012 - 0x1001]= "SET_CPUID",
        +	[0x1021 - 0x1001]= "GET_XCOMP_SUPP",
        +	[0x1022 - 0x1001]= "GET_XCOMP_PERM",
        +	[0x1023 - 0x1001]= "REQ_XCOMP_PERM",
         };
      
         #define x86_arch_prctl_codes_2_offset 0x2001
        $
      
      With this 'perf trace' can translate those numbers into strings and use
      the strings in filter expressions:
      
        # perf trace -e prctl
             0.000 ( 0.011 ms): DOM Worker/3722622 prctl(option: SET_NAME, arg2: 0x7f9c014b7df5)     = 0
             0.032 ( 0.002 ms): DOM Worker/3722622 prctl(option: SET_NAME, arg2: 0x7f9bb6b51580)     = 0
             5.452 ( 0.003 ms): StreamT~ns #30/3722623 prctl(option: SET_NAME, arg2: 0x7f9bdbdfeb70) = 0
             5.468 ( 0.002 ms): StreamT~ns #30/3722623 prctl(option: SET_NAME, arg2: 0x7f9bdbdfea70) = 0
            24.494 ( 0.009 ms): IndexedDB #556/3722624 prctl(option: SET_NAME, arg2: 0x7f562a32ae28) = 0
            24.540 ( 0.002 ms): IndexedDB #556/3722624 prctl(option: SET_NAME, arg2: 0x7f563c6d4b30) = 0
           670.281 ( 0.008 ms): systemd-userwo/3722339 prctl(option: SET_NAME, arg2: 0x564be30805c8) = 0
           670.293 ( 0.002 ms): systemd-userwo/3722339 prctl(option: SET_NAME, arg2: 0x564be30800f0) = 0
        ^C#
      
      This addresses these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/prctl.h' differs from latest version at 'arch/x86/include/uapi/asm/prctl.h'
        diff -u tools/arch/x86/include/uapi/asm/prctl.h arch/x86/include/uapi/asm/prctl.h
      
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chang S. Bae <chang.seok.bae@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/YY%2FER104k852WOTK@kernel.org/T/#u
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5b749efe
    • Jiri Olsa's avatar
      perf tools: Add more weak libbpf functions · 2a4898fc
      Jiri Olsa authored
      
      
      We hit the window where perf uses libbpf functions, that did not make it
      to the official libbpf release yet and it's breaking perf build with
      dynamicly linked libbpf.
      
      Fixing this by providing the new interface as weak functions which calls
      the original libbpf functions. Fortunatelly the changes were just
      renames.
      
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20211109140707.1689940-2-jolsa@kernel.org
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2a4898fc
    • Ian Rogers's avatar
      perf bpf: Avoid memory leak from perf_env__insert_btf() · 4924b1f7
      Ian Rogers authored
      perf_env__insert_btf() doesn't insert if a duplicate BTF id is
      encountered and this causes a memory leak. Modify the function to return
      a success/error value and then free the memory if insertion didn't
      happen.
      
      v2. Adds a return -1 when the insertion error occurs in
          perf_env__fetch_btf. This doesn't affect anything as the result is
          never checked.
      
      Fixes: 3792cb2f
      
       ("perf bpf: Save BTF in a rbtree in perf_env")
      Signed-off-by: default avatarIan Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20211112074525.121633-1-irogers@google.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4924b1f7