  1. Dec 01, 2021
  2. Nov 29, 2021
    • Linux 5.16-rc3 · d58071a8
      Linus Torvalds authored
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · d06c942e
      Linus Torvalds authored
      Pull vhost,virtio,vdpa bugfixes from Michael Tsirkin:
       "Misc fixes all over the place.
      
        Revert of virtio used length validation series: the approach taken
        does not seem to work, breaking too many guests in the process. We'll
        need to do length validation using some other approach"
      
      [ This merge also ends up reverting commit f7a36b03 ("vsock/virtio:
        suppress used length validation"), which came in through the
        networking tree in the meantime, and was part of that whole used
        length validation series - Linus ]
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vdpa_sim: avoid putting an uninitialized iova_domain
        vhost-vdpa: clean irqs before reseting vdpa device
        virtio-blk: modify the value type of num in virtio_queue_rq()
        vhost/vsock: cleanup removing `len` variable
        vhost/vsock: fix incorrect used length reported to the guest
        Revert "virtio_ring: validate used buffer length"
        Revert "virtio-net: don't let virtio core to validate used length"
        Revert "virtio-blk: don't let virtio core to validate used length"
        Revert "virtio-scsi: don't let virtio core to validate used buffer length"
    • Merge tag 'x86-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9557e60b
      Linus Torvalds authored
      Pull x86 build fix from Thomas Gleixner:
       "A single fix for a missing __init annotation of prepare_command_line()"
      
      * tag 'x86-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot: Mark prepare_command_line() __init
    • Merge tag 'sched-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 97891bbf
      Linus Torvalds authored
      Pull scheduler fix from Thomas Gleixner:
       "A single scheduler fix to ensure that there is no stale KASAN shadow
        state left on the idle task's stack when a CPU is brought up after it
        was brought down before"
      
      * tag 'sched-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/scs: Reset task stack state in bringup_cpu()
    • Merge tag 'perf-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1ed1d3a3
      Linus Torvalds authored
      Pull perf fix from Thomas Gleixner:
       "A single fix for perf to prevent it from sending SIGTRAP to another
        task from a trace point event as it's not possible to deliver a
        synchronous signal to a different task from there"
      
      * tag 'perf-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf: Ignore sigtrap for tracepoints destined for other tasks
    • Merge tag 'locking-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d039f388
      Linus Torvalds authored
      Pull locking fixes from Thomas Gleixner:
       "Two regression fixes for reader writer semaphores:
      
         - Plug a race in the lock handoff which is caused by inconsistency of
           the reader and writer path and can lead to corruption of the
           underlying counter.
      
         - down_read_trylock() is suboptimal when the lock is contended and
           multiple readers trylock concurrently. That's due to the initial
           value being read non-atomically which results in at least two
           compare exchange loops. Making the initial readout atomic reduces
           this significantly. With 40 readers by 11% in a benchmark which
           enforces contention on mmap_sem"
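
The down_read_trylock() point can be illustrated with a small single-threaded model. This is a sketch, not the kernel code: the `Counter` class, the bit layout (`READER_BIAS`, `WRITER_LOCKED`) and both function names are assumptions made for illustration. It counts compare-exchange attempts when the lock is already held by one reader.

```python
READER_BIAS = 1 << 8      # assumed: value added to the count per reader
WRITER_LOCKED = 1 << 0    # assumed: bit that makes a read trylock fail


class Counter:
    """Stand-in for the rwsem count word; counts compare-exchange attempts."""

    def __init__(self, value=0):
        self.value = value
        self.cmpxchg_calls = 0

    def try_cmpxchg(self, expected, new):
        """Return (success, observed value), like a compare-exchange."""
        self.cmpxchg_calls += 1
        observed = self.value
        if observed == expected:
            self.value = new
            return True, observed
        return False, observed


def trylock_guess_unlocked(count):
    """Older shape: start from a guessed 'unlocked' value of 0."""
    tmp = 0
    while not (tmp & WRITER_LOCKED):
        ok, tmp = count.try_cmpxchg(tmp, tmp + READER_BIAS)
        if ok:
            return True
    return False


def trylock_read_first(count):
    """Newer shape: read the current value before the first attempt."""
    tmp = count.value
    while not (tmp & WRITER_LOCKED):
        ok, tmp = count.try_cmpxchg(tmp, tmp + READER_BIAS)
        if ok:
            return True
    return False


if __name__ == "__main__":
    for trylock in (trylock_guess_unlocked, trylock_read_first):
        count = Counter(READER_BIAS)          # one reader already holds the lock
        trylock(count)
        print(trylock.__name__, "attempts:", count.cmpxchg_calls)
```

In this model the guessed starting value always costs one failed attempt whenever the lock is not idle, which is the extra compare-exchange round the message refers to.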
      
      * tag 'locking-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/rwsem: Optimize down_read_trylock() under highly contended case
        locking/rwsem: Make handoff bit handling more consistent
    • Merge tag 'trace-v5.16-rc2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · f8132d62
      Linus Torvalds authored
      Pull another tracing fix from Steven Rostedt:
       "Fix the fix of pid filtering
      
        The setting of the pid filtering flag tested the "trace only this pid"
        case twice, and ignored the "trace everything but this pid" case.
      
        The 5.15 kernel does things a little differently due to the new sparse
        pid mask introduced in 5.16, and as the bug was discovered running the
        5.15 kernel, and the first fix was initially done for that kernel,
        that fix handled both cases (only pid and all but pid), but the
        forward port to 5.16 created this bug"
      
      * tag 'trace-v5.16-rc2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Test the 'Do not trace this pid' case in create event
  3. Nov 28, 2021
    • Merge tag 'iommu-fixes-v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 0757ca01
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - Intel VT-d fixes:
           - Remove unused PASID_DISABLED
           - Fix RCU locking
           - Fix for the unmap_pages call-back
      
       - Rockchip RK3568 address mask fix
      
       - AMD IOMMUv2 log message clarification
      
      * tag 'iommu-fixes-v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Fix unmap_pages support
        iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock()
        iommu/rockchip: Fix PAGE_DESC_HI_MASKs for RK3568
        iommu/amd: Clarify AMD IOMMUv2 initialization messages
        iommu/vt-d: Remove unused PASID_DISABLED
    • Merge tag '5.16-rc2-ksmbd-fixes' of git://git.samba.org/ksmbd · 3498e7f2
      Linus Torvalds authored
      Pull ksmbd fixes from Steve French:
       "Five ksmbd server fixes, four of them for stable:
      
         - memleak fix
      
         - fix for default data stream on filesystems that don't support xattr
      
         - error logging fix
      
         - session setup fix
      
         - minor doc cleanup"
      
      * tag '5.16-rc2-ksmbd-fixes' of git://git.samba.org/ksmbd:
        ksmbd: fix memleak in get_file_stream_info()
        ksmbd: contain default data stream even if xattr is empty
        ksmbd: downgrade addition info error msg to debug in smb2_get_info_sec()
        docs: filesystem: cifs: ksmbd: Fix small layout issues
        ksmbd: Fix an error handling path in 'smb2_sess_setup()'
    • vmxnet3: Use generic Kconfig option for page size limit · 00169a92
      Guenter Roeck authored
      
      
      Use the architecture independent Kconfig option PAGE_SIZE_LESS_THAN_64KB
      to indicate that VMXNET3 requires a page size smaller than 64kB.
      
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs: ntfs: Limit NTFS_RW to page sizes smaller than 64k · 4eec7faf
      Guenter Roeck authored
      NTFS_RW code allocates page size dependent arrays on the stack. This
      results in build failures if the page size is 64k or larger.
      
        fs/ntfs/aops.c: In function 'ntfs_write_mst_block':
        fs/ntfs/aops.c:1311:1: error:
      	the frame size of 2240 bytes is larger than 2048 bytes
      
      Since commit f22969a6 ("powerpc/64s: Default to 64K pages for 64 bit
      book3s") this affects ppc:allmodconfig builds, but other architectures
      supporting page sizes of 64k or larger are also affected.
      
      Increasing the maximum frame size for affected architectures just to
      silence this error does not really help.  The frame size would have to
      be set to a really large value for 256k pages.  Also, a large frame size
      could potentially result in stack overruns in this code and elsewhere
      and is therefore not desirable.  Make NTFS_RW dependent on page sizes
      smaller than 64k instead.
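
The scaling argument above can be checked with a little arithmetic. The sketch below rests on assumptions that the commit message does not spell out: two on-stack pointer arrays, one slot per 512-byte block in a page, and 8-byte pointers. Only the 2048-byte limit comes from the quoted build error.

```python
NTFS_BLOCK_SIZE = 512   # assumed: block size the arrays are dimensioned for
POINTER_SIZE = 8        # assumed: 64-bit pointers
FRAME_LIMIT = 2048      # frame size limit quoted in the build error


def stack_array_bytes(page_size, arrays=2):
    """Bytes taken by `arrays` pointer arrays with one slot per block in a page."""
    slots = page_size // NTFS_BLOCK_SIZE
    return arrays * slots * POINTER_SIZE


if __name__ == "__main__":
    for kib in (4, 16, 64, 256):
        size = stack_array_bytes(kib * 1024)
        verdict = "reaches" if size >= FRAME_LIMIT else "stays under"
        print(f"{kib:>3}k pages: {size:>5} bytes in arrays ({verdict} the limit)")
```

Under these assumptions the arrays alone use up the whole limit at 64k pages and grow fourfold again at 256k, which is why raising the limit is not an attractive fix.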
      
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Anton Altaparmakov <anton@tuxera.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arch: Add generic Kconfig option indicating page size smaller than 64k · 1f0e290c
      Guenter Roeck authored
      
      
      NTFS_RW and VMXNET3 require a page size smaller than 64kB.  Add generic
      Kconfig option for use outside architecture code to avoid architecture
      specific Kconfig options in that code.
      
      Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Anton Altaparmakov <anton@tuxera.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tracing: Test the 'Do not trace this pid' case in create event · 27ff768f
      Steven Rostedt (VMware) authored
      When creating a new event (via a module, kprobe, eprobe, etc), the
      descriptors that are created must add flags for pid filtering if an
      instance has pid filtering enabled, as the flags are used at the time the
      event is executed to know if pid filtering should be done or not.
      
      The "Only trace this pid" case was added, but a cut and paste error made
      that case checked twice, instead of checking the "Trace all but this pid"
      case.
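
A minimal model of the cut-and-paste error described above. The function names and the flag's bit value are hypothetical; `None` stands for "this kind of pid filter is not set on the instance".

```python
EVENT_FILE_FL_PID_FILTER = 1 << 0   # bit value is hypothetical for this sketch


def flags_with_paste_error(only_pids, no_pids):
    """Reads the 'only trace this pid' list twice and never looks at no_pids."""
    pid_list = only_pids
    no_pid_list = only_pids          # pasted line: should have read no_pids
    if pid_list is not None or no_pid_list is not None:
        return EVENT_FILE_FL_PID_FILTER
    return 0


def flags_fixed(only_pids, no_pids):
    """Checks both the 'only' list and the 'all but' list."""
    pid_list = only_pids
    no_pid_list = no_pids
    if pid_list is not None or no_pid_list is not None:
        return EVENT_FILE_FL_PID_FILTER
    return 0


if __name__ == "__main__":
    # Only a 'trace all but this pid' filter is active on the instance.
    print("paste error:", flags_with_paste_error(None, {42}))
    print("fixed:      ", flags_fixed(None, {42}))
```

The two versions only differ when the "all but this pid" filter is the sole filter in use, which is why the mistake was easy to miss.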
      
      Link: https://lore.kernel.org/all/202111280401.qC0z99JB-lkp@intel.com/
      
      Fixes: 6cb20650 ("tracing: Check pid filtering when creating events")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • Merge tag 'xfs-5.16-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 4f0dda35
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Fixes for a resource leak and a build robot complaint about totally
        dead code:
      
         - Fix buffer resource leak that could lead to livelock on corrupt fs.
      
         - Remove unused function xfs_inew_wait to shut up the build robots"
      
      * tag 'xfs-5.16-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: remove xfs_inew_wait
        xfs: Fix the free logic of state in xfs_attr_node_hasname
    • Merge tag 'iomap-5.16-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · adfb743a
      Linus Torvalds authored
      Pull iomap fixes from Darrick Wong:
       "A single iomap bug fix and a cleanup for 5.16-rc2.
      
        The bug fix changes how iomap deals with reading from an inline data
        region -- whereas the current code (incorrectly) lets the iomap read
        iter try for more bytes after reading the inline region (which zeroes
        the rest of the page!) and hopes the next iteration terminates, we
        surveyed the inlinedata implementations and realized that all
        inlinedata implementations also require that the inlinedata region end
        at EOF, so we can simply terminate the read.
      
        The second patch documents these assumptions in the code so that
        they're not subtle implications anymore, and cleans up some of the
        grosser parts of that function.
      
        Summary:
      
         - Fix an accounting problem where unaligned inline data reads can run
           off the end of the read iomap iterator. iomap has historically
           required that inline data mappings only exist at the end of a file,
           though this wasn't documented anywhere.
      
         - Document iomap_read_inline_data and change its return type to be
           appropriate for the information that it's actually returning"
      
      * tag 'iomap-5.16-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        iomap: iomap_read_inline_data cleanup
        iomap: Fix inline extent handling in iomap_readpage
    • Merge tag 'trace-v5.16-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 86155d6b
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Two fixes to event pid filtering:
      
         - Make sure newly created events reflect the current state of pid
           filtering
      
         - Take pid filtering into account when recording trigger events.
           (Also clean up the if statement to be cleaner)"
      
      * tag 'trace-v5.16-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix pid filtering when triggers are attached
        tracing: Check pid filtering when creating events
    • Merge tag 'io_uring-5.16-2021-11-27' of git://git.kernel.dk/linux-block · 86799cdf
      Linus Torvalds authored
      Pull more io_uring fixes from Jens Axboe:
       "The locking fixup that was applied earlier this rc has both a deadlock
        and IRQ safety issue, let's get that ironed out before -rc3. This
        contains:
      
         - Link traversal locking fix (Pavel)
      
         - Cancelation fix (Pavel)
      
         - Relocate cond_resched() for huge buffer chain freeing, avoiding a
           softlockup warning (Ye)
      
         - Fix timespec validation (Ye)"
      
      * tag 'io_uring-5.16-2021-11-27' of git://git.kernel.dk/linux-block:
        io_uring: Fix undefined-behaviour in io_issue_sqe
        io_uring: fix soft lockup when call __io_remove_buffers
        io_uring: fix link traversal locking
        io_uring: fail cancellation for EXITING tasks
    • Merge tag 'block-5.16-2021-11-27' of git://git.kernel.dk/linux-block · 650c8edf
      Linus Torvalds authored
      Pull more block fixes from Jens Axboe:
       "Turns out that the flushing out of pending fixes before the
        Thanksgiving break didn't quite work out in terms of timing, so here's
        a followup set of fixes:
      
         - rq_qos_done() should be called regardless of whether or not we're
           the final put of the request, it's not related to the freeing of
           the state. This fixes an IO stall with wbt that a few users have
           reported, a regression in this release.
      
         - Only define zram_wb_devops if it's used, fixing a compilation
           warning for some compilers"
      
      * tag 'block-5.16-2021-11-27' of git://git.kernel.dk/linux-block:
        zram: only make zram_wb_devops for CONFIG_ZRAM_WRITEBACK
        block: call rq_qos_done() before ref check in batch completions
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 9e9fbe44
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Twelve fixes, eleven in drivers (target, qla2xx, scsi_debug, mpt3sas,
        ufs). The core fix is a minor correction to the previous state update
        fix for the iscsi daemons"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: scsi_debug: Zero clear zones at reset write pointer
        scsi: core: sysfs: Fix setting device state to SDEV_RUNNING
        scsi: scsi_debug: Sanity check block descriptor length in resp_mode_select()
        scsi: target: configfs: Delete unnecessary checks for NULL
        scsi: target: core: Use RCU helpers for INQUIRY t10_alua_tg_pt_gp
        scsi: mpt3sas: Fix incorrect system timestamp
        scsi: mpt3sas: Fix system going into read-only mode
        scsi: mpt3sas: Fix kernel panic during drive powercycle test
        scsi: ufs: ufs-mediatek: Add put_device() after of_find_device_by_node()
        scsi: scsi_debug: Fix type in min_t to avoid stack OOB
        scsi: qla2xxx: edif: Fix off by one bug in qla_edif_app_getfcinfo()
        scsi: ufs: ufshpb: Fix warning in ufshpb_set_hpb_read_to_upiu()
    • Merge tag 'nfs-for-5.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 74139277
      Linus Torvalds authored
      Pull NFS client fixes from Trond Myklebust:
       "Highlights include:
      
        Stable fixes:
      
         - NFSv42: Fix pagecache invalidation after COPY/CLONE
      
        Bugfixes:
      
         - NFSv42: Don't fail clone() just because the server failed to return
           post-op attributes
      
         - SUNRPC: use different lockdep keys for INET6 and LOCAL
      
         - NFSv4.1: handle NFS4ERR_NOSPC from CREATE_SESSION
      
         - SUNRPC: fix header include guard in trace header"
      
      * tag 'nfs-for-5.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        SUNRPC: use different lock keys for INET6 and LOCAL
        sunrpc: fix header include guard in trace header
        NFSv4.1: handle NFS4ERR_NOSPC by CREATE_SESSION
        NFSv42: Fix pagecache invalidation after COPY/CLONE
        NFS: Add a tracepoint to show the results of nfs_set_cache_invalid()
        NFSv42: Don't fail clone() unless the OP_CLONE operation failed
    • Merge tag 'erofs-for-5.16-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 52dc4c64
      Linus Torvalds authored
      Pull erofs fix from Gao Xiang:
       "Fix an ABBA deadlock introduced by XArray conversion"
      
      * tag 'erofs-for-5.16-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: fix deadlock when shrink erofs slab
    • Merge tag 'powerpc-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 7b65b798
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Fix KVM using a Power9 instruction on earlier CPUs, which could lead
        to the host SLB being incorrectly invalidated and a subsequent host
        crash.
      
        Fix kernel hardlockup on vmap stack overflow on 32-bit.
      
        Thanks to Christophe Leroy, Nicholas Piggin, and Fabiano Rosas"
      
      * tag 'powerpc-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/32: Fix hardlockup on vmap stack overflow
        KVM: PPC: Book3S HV: Prevent POWER7/8 TLB flush flushing SLB
    • Merge tag 'mips-fixes_5.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 6be08803
      Linus Torvalds authored
      Pull MIPS fixes from Thomas Bogendoerfer:
      
       - build fix for ZSTD enabled configs
      
       - fix for preempt warning
      
       - fix for loongson FTLB detection
      
       - fix for page table level selection
      
      * tag 'mips-fixes_5.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: use 3-level pgtable for 64KB page size on MIPS_VA_BITS_48
        MIPS: loongson64: fix FTLB configuration
        MIPS: Fix using smp_processor_id() in preemptible in show_cpuinfo()
        MIPS: boot/compressed/: add __ashldi3 to target for ZSTD compression
  4. Nov 27, 2021
    • io_uring: Fix undefined-behaviour in io_issue_sqe · f6223ff7
      Ye Bin authored
      
      
      We got the following issue:
      ================================================================================
      UBSAN: Undefined behaviour in ./include/linux/ktime.h:42:14
      signed integer overflow:
      -4966321760114568020 * 1000000000 cannot be represented in type 'long long int'
      CPU: 1 PID: 2186 Comm: syz-executor.2 Not tainted 4.19.90+ #12
      Hardware name: linux,dummy-virt (DT)
      Call trace:
       dump_backtrace+0x0/0x3f0 arch/arm64/kernel/time.c:78
       show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x170/0x1dc lib/dump_stack.c:118
       ubsan_epilogue+0x18/0xb4 lib/ubsan.c:161
       handle_overflow+0x188/0x1dc lib/ubsan.c:192
       __ubsan_handle_mul_overflow+0x34/0x44 lib/ubsan.c:213
       ktime_set include/linux/ktime.h:42 [inline]
       timespec64_to_ktime include/linux/ktime.h:78 [inline]
       io_timeout fs/io_uring.c:5153 [inline]
       io_issue_sqe+0x42c8/0x4550 fs/io_uring.c:5599
       __io_queue_sqe+0x1b0/0xbc0 fs/io_uring.c:5988
       io_queue_sqe+0x1ac/0x248 fs/io_uring.c:6067
       io_submit_sqe fs/io_uring.c:6137 [inline]
       io_submit_sqes+0xed8/0x1c88 fs/io_uring.c:6331
       __do_sys_io_uring_enter fs/io_uring.c:8170 [inline]
       __se_sys_io_uring_enter fs/io_uring.c:8129 [inline]
       __arm64_sys_io_uring_enter+0x490/0x980 fs/io_uring.c:8129
       invoke_syscall arch/arm64/kernel/syscall.c:53 [inline]
       el0_svc_common+0x374/0x570 arch/arm64/kernel/syscall.c:121
       el0_svc_handler+0x190/0x260 arch/arm64/kernel/syscall.c:190
       el0_svc+0x10/0x218 arch/arm64/kernel/entry.S:1017
      ================================================================================
      
      ktime_set() only compares 'secs' against KTIME_SEC_MAX, so a negative value
      can still overflow the multiplication.
      To address this issue, we must also check whether 'secs' is negative.
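
The overflow can be reproduced with plain integer arithmetic. The following is a rough user-space model, not the kernel source: `ktime_set_model`, `fits_s64` and `timeout_is_valid` are names made up for this sketch, and the model raises an exception where C would hit signed overflow. The negative value is the one from the UBSAN report above.

```python
NSEC_PER_SEC = 1_000_000_000
S64_MIN, S64_MAX = -(1 << 63), (1 << 63) - 1
KTIME_SEC_MAX = S64_MAX // NSEC_PER_SEC


def fits_s64(value):
    """True if value is representable as a signed 64-bit integer."""
    return S64_MIN <= value <= S64_MAX


def ktime_set_model(secs, nsecs):
    """Clamp large 'secs', otherwise multiply; raise where C would overflow."""
    if secs >= KTIME_SEC_MAX:
        return S64_MAX
    total = secs * NSEC_PER_SEC + nsecs
    if not fits_s64(total):
        raise OverflowError(f"{secs} * {NSEC_PER_SEC} does not fit in 64 bits")
    return total


def timeout_is_valid(secs):
    """The extra check described above: reject negative seconds up front."""
    return secs >= 0


if __name__ == "__main__":
    bad_secs = -4966321760114568020   # value from the UBSAN report
    print("passes the new check:", timeout_is_valid(bad_secs))
    try:
        ktime_set_model(bad_secs, 0)
    except OverflowError as exc:
        print("without the check:", exc)
```

The upper bound is already handled by the clamp, so rejecting negative input before the conversion is enough to keep the product in range in this model.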
      
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Link: https://lore.kernel.org/r/20211118015907.844807-1-yebin10@huawei.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix soft lockup when call __io_remove_buffers · 1d0254e6
      Ye Bin authored
      I got the following issue:
      [ 567.094140] __io_remove_buffers: [1]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
      [  594.360799] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
      [  594.364987] Modules linked in:
      [  594.365405] irq event stamp: 604180238
      [  594.365906] hardirqs last  enabled at (604180237): [<ffffffff93fec9bd>] _raw_spin_unlock_irqrestore+0x2d/0x50
      [  594.367181] hardirqs last disabled at (604180238): [<ffffffff93fbbadb>] sysvec_apic_timer_interrupt+0xb/0xc0
      [  594.368420] softirqs last  enabled at (569080666): [<ffffffff94200654>] __do_softirq+0x654/0xa9e
      [  594.369551] softirqs last disabled at (569080575): [<ffffffff913e1d6a>] irq_exit_rcu+0x1ca/0x250
      [  594.370692] CPU: 2 PID: 108 Comm: kworker/u32:5 Tainted: G            L    5.15.0-next-20211112+ #88
      [  594.371891] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
      [  594.373604] Workqueue: events_unbound io_ring_exit_work
      [  594.374303] RIP: 0010:_raw_spin_unlock_irqrestore+0x33/0x50
      [  594.375037] Code: 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 55 f5 55 fd 48 89 ef e8 ed a7 56 fd 80 e7 02 74 06 e8 43 13 7b fd fb bf 01 00 00 00 <e8> f8 78 474
      [  594.377433] RSP: 0018:ffff888101587a70 EFLAGS: 00000202
      [  594.378120] RAX: 0000000024030f0d RBX: 0000000000000246 RCX: 1ffffffff2f09106
      [  594.379053] RDX: 0000000000000000 RSI: ffffffff9449f0e0 RDI: 0000000000000001
      [  594.379991] RBP: ffffffff9586cdc0 R08: 0000000000000001 R09: fffffbfff2effcab
      [  594.380923] R10: ffffffff977fe557 R11: fffffbfff2effcaa R12: ffff8881b8f3def0
      [  594.381858] R13: 0000000000000246 R14: ffff888153a8b070 R15: 0000000000000000
      [  594.382787] FS:  0000000000000000(0000) GS:ffff888399c00000(0000) knlGS:0000000000000000
      [  594.383851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  594.384602] CR2: 00007fcbe71d2000 CR3: 00000000b4216000 CR4: 00000000000006e0
      [  594.385540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  594.386474] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  594.387403] Call Trace:
      [  594.387738]  <TASK>
      [  594.388042]  find_and_remove_object+0x118/0x160
      [  594.389321]  delete_object_full+0xc/0x20
      [  594.389852]  kfree+0x193/0x470
      [  594.390275]  __io_remove_buffers.part.0+0xed/0x147
      [  594.390931]  io_ring_ctx_free+0x342/0x6a2
      [  594.392159]  io_ring_exit_work+0x41e/0x486
      [  594.396419]  process_one_work+0x906/0x15a0
      [  594.399185]  worker_thread+0x8b/0xd80
      [  594.400259]  kthread+0x3bf/0x4a0
      [  594.401847]  ret_from_fork+0x22/0x30
      [  594.402343]  </TASK>
      
      Message from syslogd@localhost at Nov 13 09:09:54 ...
      kernel:watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
      [  596.793660] __io_remove_buffers: [2099199]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
      
      We can reproduce this issue with the following syzkaller log:
      r0 = syz_io_uring_setup(0x401, &(0x7f0000000300), &(0x7f0000003000/0x2000)=nil, &(0x7f0000ff8000/0x4000)=nil, &(0x7f0000000280)=<r1=>0x0, &(0x7f0000000380)=<r2=>0x0)
      sendmsg$ETHTOOL_MSG_FEATURES_SET(0xffffffffffffffff, &(0x7f0000003080)={0x0, 0x0, &(0x7f0000003040)={&(0x7f0000000040)=ANY=[], 0x18}}, 0x0)
      syz_io_uring_submit(r1, r2, &(0x7f0000000240)=@IORING_OP_PROVIDE_BUFFERS={0x1f, 0x5, 0x0, 0x401, 0x1, 0x0, 0x100, 0x0, 0x1, {0xfffd}}, 0x0)
      io_uring_enter(r0, 0x3a2d, 0x0, 0x0, 0x0, 0x0)
      
      The cause of the above issue is that 'buf->list' has 2,100,000 nodes; freeing
      them keeps the CPU busy long enough to trigger the soft lockup.
      To solve this issue, we need to add a scheduling point inside the while loop in
      '__io_remove_buffers'.
      After adding the scheduling point, a regression run produced the following data:
      [  240.141864] __io_remove_buffers: [1]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
      [  268.408260] __io_remove_buffers: [1]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
      [  275.899234] __io_remove_buffers: [2099199]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
      [  296.741404] __io_remove_buffers: [1]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
      [  305.090059] __io_remove_buffers: [2099199]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
      [  325.415746] __io_remove_buffers: [1]start ctx=0xffff8881b92d1000 bgid=65533 buf=0xffff8881a17d8f00
      [  333.160318] __io_remove_buffers: [2099199]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
      ...
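
The shape of the change (a scheduling point inside the loop rather than outside it) can be sketched as follows. This is a toy model with hypothetical names: `remove_buffers` stands in for the freeing loop and the `cond_resched` callback for the kernel's scheduling point, which in the real kernel only yields when a reschedule is actually pending.

```python
def remove_buffers(buffers, cond_resched):
    """Free every buffer in the list, offering to yield after each one."""
    freed = 0
    while buffers:
        buffers.pop()      # stands in for unlinking and freeing one buffer
        freed += 1
        cond_resched()     # scheduling point inside the loop
    return freed


if __name__ == "__main__":
    yields = []
    count = remove_buffers(list(range(2_100_000)), lambda: yields.append(None))
    print("freed", count, "buffers with", len(yields), "scheduling points")
```

With the scheduling point inside the loop, the number of chances to yield grows with the list length instead of staying constant per call.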
      
      Fixes: 8bab4c09 ("io_uring: allow conditional reschedule for intensive iterators")
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Link: https://lore.kernel.org/r/20211122024737.2198530-1-yebin10@huawei.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • tracing: Fix pid filtering when triggers are attached · a55f224f
      Steven Rostedt (VMware) authored
      If an event is filtered by pid and a trigger that requires processing of
      the event to happen is attached to the event, the discard portion does
      not take the pid filtering into account, and the event will then be
      recorded when it should not have been.
      
      Cc: stable@vger.kernel.org
      Fixes: 3fdaf80f ("tracing: Implement event pid filtering")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • iommu/vt-d: Fix unmap_pages support · 86dc40c7
      Alex Williamson authored
      When supporting only the .map and .unmap callbacks of iommu_ops,
      the IOMMU driver can make assumptions about the size and alignment
      used for mappings based on the driver provided pgsize_bitmap.  VT-d
      previously used essentially PAGE_MASK for this bitmap as any power
      of two mapping was acceptably filled by native page sizes.
      
      However, with the .map_pages and .unmap_pages interface we're now
      getting page-size and count arguments.  If we simply combine these
      as (page-size * count) and make use of the previous map/unmap
      functions internally, any size and alignment assumptions are very
      different.
      
      As an example, a given vfio device assignment VM will often create
      a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff].  On a system that
      does not support IOMMU super pages, the unmap_pages interface will
      ask to unmap 1024 4KB pages at the base IOVA.  dma_pte_clear_level()
      will recurse down to level 2 of the page table where the first half
      of the pfn range exactly matches the entire pte level.  We clear the
      pte, increment the pfn by the level size, but (oops) the next pte is
      on a new page, so we exit the loop and pop back up a level.  When we
      then update the pfn based on that higher level, we seem to assume
      that the previous pfn value was at the start of the level.  In this
      case the level size is 256K pfns, which we add to the base pfn and
      get a result of 0x7fe00, which is clearly greater than 0x401ff,
      so we're done.  Meanwhile we never cleared the ptes for the remainder
      of the range.  When the VM remaps this range, we're overwriting valid
      ptes and the VT-d driver complains loudly, as reported by the user
      report linked below.
      
      The fix for this seems relatively simple: if each iteration of the
      loop in dma_pte_clear_level() is assumed to clear to the end of the
      level pte page, then our next pfn should be calculated from level_pfn
      rather than our working pfn.
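
The walk described above can be replayed with a small model. It is a simplified sketch rather than the driver code: it assumes a 9-bit stride per level, that every pte is present, that no super pages are in use, and that the walk starts at level 3. The `clear_level` and `unmap` helpers are names invented here, and "clearing" is reduced to recording the pfn range covered by the cleared entry.

```python
LEVEL_STRIDE = 9                      # assumed: 512 entries per page-table page
PTES_PER_PAGE = 1 << LEVEL_STRIDE


def offset_bits(level):
    return (level - 1) * LEVEL_STRIDE


def level_size(level):
    """Number of pfns covered by one entry at this level."""
    return 1 << offset_bits(level)


def level_mask(level):
    return ~(level_size(level) - 1)


def clear_level(level, pfn, start_pfn, last_pfn, fixed, cleared):
    """Walk one page-table page, recording the pfn ranges whose entry is cleared."""
    pfn = max(start_pfn, pfn)
    index = (pfn >> offset_bits(level)) & (PTES_PER_PAGE - 1)
    while True:
        level_pfn = pfn & level_mask(level)
        level_last = level_pfn + level_size(level) - 1
        if start_pfn <= level_pfn and last_pfn >= level_last:
            cleared.append((level_pfn, level_last))     # entry fully covered
        elif level > 1:
            clear_level(level - 1, level_pfn, start_pfn, last_pfn, fixed, cleared)
        # Old behaviour steps from the working pfn; the fix steps from level_pfn.
        pfn = (level_pfn if fixed else pfn) + level_size(level)
        index += 1
        if index == PTES_PER_PAGE or pfn > last_pfn:    # end of page or of range
            break


def unmap(start_pfn, last_pfn, fixed, top_level=3):
    cleared = []
    clear_level(top_level, 0, start_pfn, last_pfn, fixed, cleared)
    return cleared


if __name__ == "__main__":
    for fixed in (False, True):
        ranges = unmap(0x3fe00, 0x401ff, fixed)
        print("fixed" if fixed else "old  ", [(hex(a), hex(b)) for a, b in ranges])
```

Running it on the 0x3fe00 to 0x401ff range from the example shows the old stepping jumping to 0x7fe00 and leaving the second half untouched, while stepping from level_pfn reaches the entry that starts at 0x40000.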
      
      Fixes: 3f34f125 ("iommu/vt-d: Implement map/unmap_pages() iommu_ops callback")
      Reported-by: Ajay Garg <ajaygargnsit@gmail.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Tested-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
      Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargnsit@gmail.com/
      Link: https://lore.kernel.org/r/163659074748.1617923.12716161410774184024.stgit@omen
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20211126135556.397932-3-baolu.lu@linux.intel.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock() · 4e5973dd
      Christophe JAILLET authored
      If we return -EOPNOTSUPP, the RCU read lock remains held. This is spurious.
      Go through the end of the function instead. This way, the missing
      'rcu_read_unlock()' is called.
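
The control-flow change can be sketched with a nesting counter standing in for the RCU read-side critical section. Everything here is hypothetical scaffolding (the `RcuReadSection` class and both function names); it only illustrates why a single exit path keeps lock and unlock balanced.

```python
import errno


class RcuReadSection:
    """Toy stand-in that tracks read-lock nesting depth."""

    def __init__(self):
        self.depth = 0

    def lock(self):
        self.depth += 1

    def unlock(self):
        self.depth -= 1


def sanity_check_early_return(rcu, supported):
    rcu.lock()
    if not supported:
        return -errno.EOPNOTSUPP      # returns with the read lock still held
    rcu.unlock()
    return 0


def sanity_check_single_exit(rcu, supported):
    rcu.lock()
    ret = 0
    if not supported:
        ret = -errno.EOPNOTSUPP       # remember the error, fall through
    rcu.unlock()
    return ret


if __name__ == "__main__":
    for check in (sanity_check_early_return, sanity_check_single_exit):
        rcu = RcuReadSection()
        ret = check(rcu, supported=False)
        print(check.__name__, "ret:", ret, "depth after call:", rcu.depth)
```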
      
      Fixes: 7afd7f6a ("iommu/vt-d: Check FL and SL capability sanity in scalable mode")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Link: https://lore.kernel.org/r/40cc077ca5f543614eab2a10e84d29dd190273f6.1636217517.git.christophe.jaillet@wanadoo.fr
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20211126135556.397932-2-baolu.lu@linux.intel.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      4e5973dd
    • Alex Bee's avatar
      iommu/rockchip: Fix PAGE_DESC_HI_MASKs for RK3568 · f7ff3cff
      Alex Bee authored
      With the submission of the iommu driver for RK3568, a subtle bug was
      introduced: PAGE_DESC_HI_MASK1 and PAGE_DESC_HI_MASK2 have to be
      the other way around. The swap leads to random errors, especially when
      addresses beyond 32 bits are used.
      
      Fix it.
      
      Fixes: c55356c5 ("iommu: rockchip: Add support for iommu v2")
      Signed-off-by: Alex Bee <knaerzche@gmail.com>
      Tested-by: Peter Geis <pgwipeout@gmail.com>
      Reviewed-by: Heiko Stuebner <heiko@sntech.de>
      Tested-by: Dan Johansen <strit@manjaro.org>
      Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
      Link: https://lore.kernel.org/r/20211124021325.858139-1-knaerzche@gmail.com
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      f7ff3cff
    • Joerg Roedel's avatar
      iommu/amd: Clarify AMD IOMMUv2 initialization messages · 717e88aa
      Joerg Roedel authored
      
      
      The messages printed on the initialization of the AMD IOMMUv2 driver
      have caused some confusion in the past. Clarify the messages to lower
      the confusion in the future.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Link: https://lore.kernel.org/r/20211123105507.7654-3-joro@8bytes.org
      717e88aa
    • Joerg Roedel's avatar
      iommu/vt-d: Remove unused PASID_DISABLED · 21e96a20
      Joerg Roedel authored
      The macro is unused after commit 00ecd540 so it can be removed.

      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: 00ecd540 ("iommu/vt-d: Clean up unused PASID updating functions")
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20211123105507.7654-2-joro@8bytes.org
      21e96a20
    • Linus Torvalds's avatar
      Merge tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · c5c17547
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes, including fixes from netfilter.
      
        Current release - regressions:
      
         - r8169: fix incorrect mac address assignment
      
         - vlan: fix underflow for the real_dev refcnt when vlan creation
           fails
      
         - smc: avoid warning of possible recursive locking
      
        Current release - new code bugs:
      
         - vsock/virtio: suppress used length validation
      
         - neigh: fix crash in v6 module initialization error path
      
        Previous releases - regressions:
      
         - af_unix: fix change in behavior in read after shutdown
      
         - igb: fix netpoll exit with traffic, avoid warning
      
         - tls: fix splice_read() when starting mid-record
      
         - lan743x: fix deadlock in lan743x_phy_link_status_change()
      
         - marvell: prestera: fix bridge port operation
      
        Previous releases - always broken:
      
         - tcp_cubic: fix spurious Hystart ACK train detections for
           not-cwnd-limited flows
      
         - nexthop: fix refcount issues when replacing IPv6 groups
      
         - nexthop: fix null pointer dereference when IPv6 is not enabled
      
         - phylink: force link down and retrigger resolve on interface change
      
         - mptcp: fix delack timer length calculation and incorrect early
           clearing
      
         - ieee802154: handle iftypes as u32, prevent shift-out-of-bounds
      
         - nfc: virtual_ncidev: change default device permissions
      
         - netfilter: ctnetlink: fix error codes and flags used for kernel
           side filtering of dumps
      
         - netfilter: flowtable: fix IPv6 tunnel addr match
      
         - ncsi: align payload to 32-bit to fix dropped packets
      
         - iavf: fix deadlock and loss of config during VF interface reset
      
         - ice: avoid bpf_prog refcount underflow
      
         - ocelot: fix broken PTP over IP and PTP API violations
      
        Misc:
      
         - marvell: mvpp2: increase MTU limit when XDP enabled"
      
      * tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
        net: dsa: microchip: implement multi-bridge support
        net: mscc: ocelot: correctly report the timestamping RX filters in ethtool
        net: mscc: ocelot: set up traps for PTP packets
        net: ptp: add a definition for the UDP port for IEEE 1588 general messages
        net: mscc: ocelot: create a function that replaces an existing VCAP filter
        net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP
        net: hns3: fix incorrect components info of ethtool --reset command
        net: hns3: fix one incorrect value of page pool info when queried by debugfs
        net: hns3: add check NULL address for page pool
        net: hns3: fix VF RSS failed problem after PF enable multi-TCs
        net: qed: fix the array may be out of bound
        net/smc: Don't call clcsock shutdown twice when smc shutdown
        net: vlan: fix underflow for the real_dev refcnt
        ptp: fix filter names in the documentation
        ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce()
        nfc: virtual_ncidev: change default device permissions
        net/sched: sch_ets: don't peek at classes beyond 'nbands'
        net: stmmac: Disable Tx queues when reconfiguring the interface
        selftests: tls: test for correct proto_ops
        tls: fix replacing proto_ops
        ...
      c5c17547
    • Oleksij Rempel's avatar
      net: dsa: microchip: implement multi-bridge support · b3612ccd
      Oleksij Rempel authored
      The current driver version can handle only one bridge at a time.
      Configuring two bridges on two different ports ends up shorting these
      bridges in hardware. To reproduce it:
      
      	ip l a name br0 type bridge
      	ip l a name br1 type bridge
      	ip l s dev br0 up
      	ip l s dev br1 up
      	ip l s lan1 master br0
      	ip l s dev lan1 up
      	ip l s lan2 master br1
      	ip l s dev lan2 up
      
      	Ping on lan1 and get response on lan2, which should not happen.
      
      This happens because the current driver version stores one global "Port VLAN
      Membership" and applies it to all ports that are members of any
      bridge.
      To solve this issue, we need to handle each port separately.
      
      This patch drops the global port member storage and calculates
      membership dynamically depending on STP state and bridge participation.
      
      Note: STP support was broken before this patch and should be fixed
      separately.
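
      One way to picture the per-port calculation is the sketch below. It is
      an illustration under stated assumptions, not the driver's code:
      struct port_state, port_membership() and the cpu_mask parameter are
      hypothetical names, and the STP state is reduced to a single
      "forwarding" flag.

```c
#include <stdint.h>

/* Hypothetical per-port state used only for this illustration. */
struct port_state {
	int bridge_id;	/* -1 when the port is not part of any bridge */
	int forwarding;	/* non-zero when the STP state allows forwarding */
};

/* Compute the bitmask of ports that `port` may forward to: other
 * forwarding ports in the same bridge, plus the CPU port(s).  Ports in
 * a different bridge are never included, so two bridges stay isolated. */
uint32_t port_membership(const struct port_state *ports, int nports,
			 int port, uint32_t cpu_mask)
{
	uint32_t members = 0;
	int i;

	if (ports[port].bridge_id >= 0 && ports[port].forwarding) {
		for (i = 0; i < nports; i++) {
			if (i == port)
				continue;
			if (ports[i].bridge_id != ports[port].bridge_id)
				continue;
			if (!ports[i].forwarding)
				continue;
			members |= UINT32_C(1) << i;
		}
	}

	return members | cpu_mask;
}
```

      With lan1 in br0 and lan2 in br1, as in the reproduction steps above,
      neither port appears in the other's mask, so a ping on lan1 cannot be
      answered on lan2.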
      
      Fixes: c2e86691 ("net: dsa: microchip: break KSZ9477 DSA driver into two files")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20211126123926.2981028-1-o.rempel@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b3612ccd
    • Linus Torvalds's avatar
      Merge tag 'acpi-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 5367cf1c
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix a NULL pointer dereference in the CPPC library code and a
        locking issue related to printing the names of ACPI device nodes in
        the device properties framework.
      
        Specifics:
      
         - Fix NULL pointer dereference in the CPPC library code occurring on
           hybrid systems without CPPC support (Rafael Wysocki).
      
         - Avoid attempts to acquire a semaphore with interrupts off when
           printing the names of ACPI device nodes and clean up code on top of
           that fix (Sakari Ailus)"
      
      * tag 'acpi-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: CPPC: Add NULL pointer check to cppc_get_perf()
        ACPI: Make acpi_node_get_parent() local
        ACPI: Get acpi_device's parent from the parent field
      5367cf1c
    • Linus Torvalds's avatar
      Merge tag 'pm-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 0ce629b1
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These address three issues in the intel_pstate driver and fix two
        problems related to hibernation.
      
        Specifics:
      
         - Make intel_pstate work correctly on Ice Lake server systems with
           out-of-band performance control enabled (Adamos Ttofari).
      
         - Fix EPP handling in intel_pstate during CPU offline and online in
           the active mode (Rafael Wysocki).
      
         - Make intel_pstate support ITMT on asymmetric systems with
           overclocking enabled (Srinivas Pandruvada).
      
         - Fix hibernation image saving when using the user space interface
           based on the snapshot special device file (Evan Green).
      
         - Make the hibernation code release the snapshot block device using
           the same mode that was used when acquiring it (Thomas Zeitlhofer)"
      
      * tag 'pm-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM: hibernate: Fix snapshot partial write lengths
        PM: hibernate: use correct mode for swsusp_close()
        cpufreq: intel_pstate: ITMT support for overclocked system
        cpufreq: intel_pstate: Fix active mode offline/online EPP handling
        cpufreq: intel_pstate: Add Ice Lake server to out-of-band IDs
      0ce629b1
    • Linus Torvalds's avatar
      Merge tag 'fuse-fixes-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 925c9437
      Linus Torvalds authored
      Pull fuse fix from Miklos Szeredi:
       "Fix a regression caused by a bugfix in the previous release. The
        symptom is a VM_BUG_ON triggered from splice to the fuse device.
      
        Unfortunately the original bugfix was already backported to a number
        of stable releases, so this fix-fix will need to be backported as
        well"
      
      * tag 'fuse-fixes-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: release pipe buf after last use
      925c9437
    • Jakub Kicinski's avatar
      Merge branch 'fix-broken-ptp-over-ip-on-ocelot-switches' · 32c54497
      Jakub Kicinski authored
      
      
      Vladimir Oltean says:
      
      ====================
      Fix broken PTP over IP on Ocelot switches
      
      Changes in v2: added patch 5, added Richard's ack for the whole series
      sans patch 5 which is new.
      
      Po Liu reported recently that timestamping PTP over IPv4 is broken using
      the felix driver on NXP LS1028A. This has been known for a while, of
      course, since it has always been broken. The reason is that IP PTP
      packets are currently treated as unknown IP multicast, which is not
      flooded to the CPU port in the ocelot driver design, so packets don't
      reach the ptp4l program.
      
      The series solves the problem by installing packet traps per port when
      the timestamping ioctl is called, depending on the RX filter selected
      (L2, L4 or both).
      ====================
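
      The idea of selecting traps from the RX filter can be sketched as
      follows. This is a simplified model rather than the ocelot code: the
      enum, the struct and both functions are hypothetical. The only values
      taken from IEEE 1588 are the PTP EtherType (0x88F7) and the UDP ports
      for event (319) and general (320) messages.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTP_ETHERTYPE	0x88F7	/* PTP directly over Ethernet (L2) */
#define PTP_UDP_EVENT	319	/* PTP event messages over UDP (L4) */
#define PTP_UDP_GENERAL	320	/* PTP general messages over UDP (L4) */

/* Hypothetical, simplified RX filter choices. */
enum rx_filter {
	RX_FILTER_NONE,
	RX_FILTER_L2,
	RX_FILTER_L4,
	RX_FILTER_L2_L4,
};

/* Which kinds of trap are installed on a port. */
struct ptp_traps {
	bool l2;
	bool l4;
};

/* Pick the traps to install for the RX filter requested via the
 * timestamping ioctl. */
struct ptp_traps traps_for_filter(enum rx_filter filter)
{
	struct ptp_traps t = { false, false };

	switch (filter) {
	case RX_FILTER_L2:
		t.l2 = true;
		break;
	case RX_FILTER_L4:
		t.l4 = true;
		break;
	case RX_FILTER_L2_L4:
		t.l2 = true;
		t.l4 = true;
		break;
	case RX_FILTER_NONE:
		break;
	}

	return t;
}

/* Decide whether a frame would be trapped to the CPU.  udp_dport is 0
 * for frames that do not carry UDP. */
bool frame_is_trapped(struct ptp_traps t, uint16_t ethertype,
		      uint16_t udp_dport)
{
	if (t.l2 && ethertype == PTP_ETHERTYPE)
		return true;
	if (t.l4 && (udp_dport == PTP_UDP_EVENT ||
		     udp_dport == PTP_UDP_GENERAL))
		return true;
	return false;
}
```

      In this model, a port configured with only the L2 filter would not
      trap PTP carried over UDP, which is why the filter selection decides
      which traps are needed.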
      
      Link: https://lore.kernel.org/r/20211126172845.3149260-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      32c54497