Skip to content
  1. Dec 10, 2021
  2. Dec 09, 2021
  3. Nov 30, 2021
  4. Nov 29, 2021
  5. Nov 28, 2021
  6. Nov 27, 2021
    • Ye Bin's avatar
      io_uring: Fix undefined-behaviour in io_issue_sqe · f6223ff7
      Ye Bin authored
      
      
      We got issue as follows:
      ================================================================================
      UBSAN: Undefined behaviour in ./include/linux/ktime.h:42:14
      signed integer overflow:
      -4966321760114568020 * 1000000000 cannot be represented in type 'long long int'
      CPU: 1 PID: 2186 Comm: syz-executor.2 Not tainted 4.19.90+ #12
      Hardware name: linux,dummy-virt (DT)
      Call trace:
       dump_backtrace+0x0/0x3f0 arch/arm64/kernel/time.c:78
       show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x170/0x1dc lib/dump_stack.c:118
       ubsan_epilogue+0x18/0xb4 lib/ubsan.c:161
       handle_overflow+0x188/0x1dc lib/ubsan.c:192
       __ubsan_handle_mul_overflow+0x34/0x44 lib/ubsan.c:213
       ktime_set include/linux/ktime.h:42 [inline]
       timespec64_to_ktime include/linux/ktime.h:78 [inline]
       io_timeout fs/io_uring.c:5153 [inline]
       io_issue_sqe+0x42c8/0x4550 fs/io_uring.c:5599
       __io_queue_sqe+0x1b0/0xbc0 fs/io_uring.c:5988
       io_queue_sqe+0x1ac/0x248 fs/io_uring.c:6067
       io_submit_sqe fs/io_uring.c:6137 [inline]
       io_submit_sqes+0xed8/0x1c88 fs/io_uring.c:6331
       __do_sys_io_uring_enter fs/io_uring.c:8170 [inline]
       __se_sys_io_uring_enter fs/io_uring.c:8129 [inline]
       __arm64_sys_io_uring_enter+0x490/0x980 fs/io_uring.c:8129
       invoke_syscall arch/arm64/kernel/syscall.c:53 [inline]
       el0_svc_common+0x374/0x570 arch/arm64/kernel/syscall.c:121
       el0_svc_handler+0x190/0x260 arch/arm64/kernel/syscall.c:190
       el0_svc+0x10/0x218 arch/arm64/kernel/entry.S:1017
      ================================================================================
      
      As ktime_set only judge 'secs' if big than KTIME_SEC_MAX, but if we pass
      negative value maybe lead to overflow.
      To address this issue, we must check if 'sec' is negative.
      
      Signed-off-by: default avatarYe Bin <yebin10@huawei.com>
      Link: https://lore.kernel.org/r/20211118015907.844807-1-yebin10@huawei.com
      
      
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      f6223ff7
    • Ye Bin's avatar
      io_uring: fix soft lockup when call __io_remove_buffers · 1d0254e6
      Ye Bin authored
      I got issue as follows:
      [ 567.094140] __io_remove_buffers: [1]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
      [  594.360799] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
      [  594.364987] Modules linked in:
      [  594.365405] irq event stamp: 604180238
      [  594.365906] hardirqs last  enabled at (604180237): [<ffffffff93fec9bd>] _raw_spin_unlock_irqrestore+0x2d/0x50
      [  594.367181] hardirqs last disabled at (604180238): [<ffffffff93fbbadb>] sysvec_apic_timer_interrupt+0xb/0xc0
      [  594.368420] softirqs last  enabled at (569080666): [<ffffffff94200654>] __do_softirq+0x654/0xa9e
      [  594.369551] softirqs last disabled at (569080575): [<ffffffff913e1d6a>] irq_exit_rcu+0x1ca/0x250
      [  594.370692] CPU: 2 PID: 108 Comm: kworker/u32:5 Tainted: G            L    5.15.0-next-20211112+ #88
      [  594.371891] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
      [  594.373604] Workqueue: events_unbound io_ring_exit_work
      [  594.374303] RIP: 0010:_raw_spin_unlock_irqrestore+0x33/0x50
      [  594.375037] Code: 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 55 f5 55 fd 48 89 ef e8 ed a7 56 fd 80 e7 02 74 06 e8 43 13 7b fd fb bf 01 00 00 00 <e8> f8 78 474
      [  594.377433] RSP: 0018:ffff888101587a70 EFLAGS: 00000202
      [  594.378120] RAX: 0000000024030f0d RBX: 0000000000000246 RCX: 1ffffffff2f09106
      [  594.379053] RDX: 0000000000000000 RSI: ffffffff9449f0e0 RDI: 0000000000000001
      [  594.379991] RBP: ffffffff9586cdc0 R08: 0000000000000001 R09: fffffbfff2effcab
      [  594.380923] R10: ffffffff977fe557 R11: fffffbfff2effcaa R12: ffff8881b8f3def0
      [  594.381858] R13: 0000000000000246 R14: ffff888153a8b070 R15: 0000000000000000
      [  594.382787] FS:  0000000000000000(0000) GS:ffff888399c00000(0000) knlGS:0000000000000000
      [  594.383851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  594.384602] CR2: 00007fcbe71d2000 CR3: 00000000b4216000 CR4: 00000000000006e0
      [  594.385540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  594.386474] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  594.387403] Call Trace:
      [  594.387738]  <TASK>
      [  594.388042]  find_and_remove_object+0x118/0x160
      [  594.389321]  delete_object_full+0xc/0x20
      [  594.389852]  kfree+0x193/0x470
      [  594.390275]  __io_remove_buffers.part.0+0xed/0x147
      [  594.390931]  io_ring_ctx_free+0x342/0x6a2
      [  594.392159]  io_ring_exit_work+0x41e/0x486
      [  594.396419]  process_one_work+0x906/0x15a0
      [  594.399185]  worker_thread+0x8b/0xd80
      [  594.400259]  kthread+0x3bf/0x4a0
      [  594.401847]  ret_from_fork+0x22/0x30
      [  594.402343]  </TASK>
      
      Message from syslogd@localhost at Nov 13 09:09:54 ...
      kernel:watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108]
      [  596.793660] __io_remove_buffers: [2099199]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680
      
      We can reproduce this issue by follow syzkaller log:
      r0 = syz_io_uring_setup(0x401, &(0x7f0000000300), &(0x7f0000003000/0x2000)=nil, &(0x7f0000ff8000/0x4000)=nil, &(0x7f0000000280)=<r1=>0x0, &(0x7f0000000380)=<r2=>0x0)
      sendmsg$ETHTOOL_MSG_FEATURES_SET(0xffffffffffffffff, &(0x7f0000003080)={0x0, 0x0, &(0x7f0000003040)={&(0x7f0000000040)=ANY=[], 0x18}}, 0x0)
      syz_io_uring_submit(r1, r2, &(0x7f0000000240)=@IORING_OP_PROVIDE_BUFFERS={0x1f, 0x5, 0x0, 0x401, 0x1, 0x0, 0x100, 0x0, 0x1, {0xfffd}}, 0x0)
      io_uring_enter(r0, 0x3a2d, 0x0, 0x0, 0x0, 0x0)
      
      The reason above issue  is 'buf->list' has 2,100,000 nodes, occupied cpu lead
      to soft lockup.
      To solve this issue, we need add schedule point when do while loop in
      '__io_remove_buffers'.
      After add  schedule point we do regression, get follow data.
      [  240.141864] __io_remove_buffers: [1]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
      [  268.408260] __io_remove_buffers: [1]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
      [  275.899234] __io_remove_buffers: [2099199]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00
      [  296.741404] __io_remove_buffers: [1]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
      [  305.090059] __io_remove_buffers: [2099199]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180
      [  325.415746] __io_remove_buffers: [1]start ctx=0xffff8881b92d1000 bgid=65533 buf=0xffff8881a17d8f00
      [  333.160318] __io_remove_buffers: [2099199]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380
      ...
      
      Fixes:8bab4c09
      
      ("io_uring: allow conditional reschedule for intensive iterators")
      Signed-off-by: default avatarYe Bin <yebin10@huawei.com>
      Link: https://lore.kernel.org/r/20211122024737.2198530-1-yebin10@huawei.com
      
      
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      1d0254e6
    • Steven Rostedt (VMware)'s avatar
      tracing: Fix pid filtering when triggers are attached · a55f224f
      Steven Rostedt (VMware) authored
      If a event is filtered by pid and a trigger that requires processing of
      the event to happen is a attached to the event, the discard portion does
      not take the pid filtering into account, and the event will then be
      recorded when it should not have been.
      
      Cc: stable@vger.kernel.org
      Fixes: 3fdaf80f
      
       ("tracing: Implement event pid filtering")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      a55f224f
    • Alex Williamson's avatar
      iommu/vt-d: Fix unmap_pages support · 86dc40c7
      Alex Williamson authored
      When supporting only the .map and .unmap callbacks of iommu_ops,
      the IOMMU driver can make assumptions about the size and alignment
      used for mappings based on the driver provided pgsize_bitmap.  VT-d
      previously used essentially PAGE_MASK for this bitmap as any power
      of two mapping was acceptably filled by native page sizes.
      
      However, with the .map_pages and .unmap_pages interface we're now
      getting page-size and count arguments.  If we simply combine these
      as (page-size * count) and make use of the previous map/unmap
      functions internally, any size and alignment assumptions are very
      different.
      
      As an example, a given vfio device assignment VM will often create
      a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff].  On a system that
      does not support IOMMU super pages, the unmap_pages interface will
      ask to unmap 1024 4KB pages at the base IOVA.  dma_pte_clear_level()
      will recurse down to level 2 of the page table where the first half
      of the pfn range exactly matches the entire pte level.  We clear the
      pte, increment the pfn by the level size, but (oops) the next pte is
      on a new page, so we exit the loop an pop back up a level.  When we
      then update the pfn based on that higher level, we seem to assume
      that the previous pfn value was at the start of the level.  In this
      case the level size is 256K pfns, which we add to the base pfn and
      get a results of 0x7fe00, which is clearly greater than 0x401ff,
      so we're done.  Meanwhile we never cleared the ptes for the remainder
      of the range.  When the VM remaps this range, we're overwriting valid
      ptes and the VT-d driver complains loudly, as reported by the user
      report linked below.
      
      The fix for this seems relatively simple, if each iteration of the
      loop in dma_pte_clear_level() is assumed to clear to the end of the
      level pte page, then our next pfn should be calculated from level_pfn
      rather than our working pfn.
      
      Fixes: 3f34f125
      
       ("iommu/vt-d: Implement map/unmap_pages() iommu_ops callback")
      Reported-by: default avatarAjay Garg <ajaygargnsit@gmail.com>
      Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
      Tested-by: default avatarGiovanni Cabiddu <giovanni.cabiddu@intel.com>
      Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargnsit@gmail.com/
      Link: https://lore.kernel.org/r/163659074748.1617923.12716161410774184024.stgit@omen
      
      
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20211126135556.397932-3-baolu.lu@linux.intel.com
      
      
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      86dc40c7
    • Christophe JAILLET's avatar
      iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock() · 4e5973dd
      Christophe JAILLET authored
      If we return -EOPNOTSUPP, the rcu lock remains lock. This is spurious.
      Go through the end of the function instead. This way, the missing
      'rcu_read_unlock()' is called.
      
      Fixes: 7afd7f6a
      
       ("iommu/vt-d: Check FL and SL capability sanity in scalable mode")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Link: https://lore.kernel.org/r/40cc077ca5f543614eab2a10e84d29dd190273f6.1636217517.git.christophe.jaillet@wanadoo.fr
      
      
      Signed-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20211126135556.397932-2-baolu.lu@linux.intel.com
      
      
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      4e5973dd
    • Alex Bee's avatar
      iommu/rockchip: Fix PAGE_DESC_HI_MASKs for RK3568 · f7ff3cff
      Alex Bee authored
      With the submission of iommu driver for RK3568 a subtle bug was
      introduced: PAGE_DESC_HI_MASK1 and PAGE_DESC_HI_MASK2 have to be
      the other way arround - that leads to random errors, especially when
      addresses beyond 32 bit are used.
      
      Fix it.
      
      Fixes: c55356c5
      
       ("iommu: rockchip: Add support for iommu v2")
      Signed-off-by: default avatarAlex Bee <knaerzche@gmail.com>
      Tested-by: default avatarPeter Geis <pgwipeout@gmail.com>
      Reviewed-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Tested-by: default avatarDan Johansen <strit@manjaro.org>
      Reviewed-by: default avatarBenjamin Gaignard <benjamin.gaignard@collabora.com>
      Link: https://lore.kernel.org/r/20211124021325.858139-1-knaerzche@gmail.com
      
      
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      f7ff3cff
    • Joerg Roedel's avatar
      iommu/amd: Clarify AMD IOMMUv2 initialization messages · 717e88aa
      Joerg Roedel authored
      
      
      The messages printed on the initialization of the AMD IOMMUv2 driver
      have caused some confusion in the past. Clarify the messages to lower
      the confusion in the future.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      Link: https://lore.kernel.org/r/20211123105507.7654-3-joro@8bytes.org
      717e88aa
    • Joerg Roedel's avatar
      iommu/vt-d: Remove unused PASID_DISABLED · 21e96a20
      Joerg Roedel authored
      The macro is unused after commit 00ecd540
      
       so it can be removed.
      
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Fixes: 00ecd540
      
       ("iommu/vt-d: Clean up unused PASID updating functions")
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      Reviewed-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
      Link: https://lore.kernel.org/r/20211123105507.7654-2-joro@8bytes.org
      21e96a20