  1. Jan 07, 2022
• md: add support for REQ_NOWAIT · f51d46d0
  Vishal Verma authored
commit 021a2446 ("block: add QUEUE_FLAG_NOWAIT") added support for
checking whether a given bdev supports handling of REQ_NOWAIT or not.
Since then, commit 6abc4946 ("dm: add support for REQ_NOWAIT and enable
it for linear target") added support for REQ_NOWAIT to dm. This patch
uses a similar approach to incorporate REQ_NOWAIT for md based bios.
      
This patch was tested using the t/io_uring tool from fio. An NVMe drive
was partitioned into 2 partitions, and a simple raid 0 configuration
/dev/md0 was created.
      
      md0 : active raid0 nvme4n1p1[1] nvme4n1p2[0]
            937423872 blocks super 1.2 512k chunks
      
      Before patch:
      
      $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
      
      Running top while the above runs:
      
      $ ps -eL | grep $(pidof io_uring)
      
        38396   38396 pts/2    00:00:00 io_uring
        38396   38397 pts/2    00:00:15 io_uring
        38396   38398 pts/2    00:00:13 iou-wrk-38397
      
We can see that the io worker thread iou-wrk-38397 was created; io_uring
creates it when it sees that the underlying device (/dev/md0 in this
case) doesn't support nowait.
      
      After patch:
      
      $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
      
      Running top while the above runs:
      
      $ ps -eL | grep $(pidof io_uring)
      
        38341   38341 pts/2    00:10:22 io_uring
        38341   38342 pts/2    00:10:37 io_uring
      
With this patch applied, we don't see any io worker thread being
created, which indicates that io_uring saw that the underlying device
does support nowait. This is the exact behaviour seen on a dm device,
which also supports nowait.
      
For all the other raid personalities except raid0, the code paths
involving the make_request fn would need to be taught to correctly
handle REQ_NOWAIT.
      
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Vishal Verma <vverma@digitalocean.com>
Signed-off-by: Song Liu <song@kernel.org>
• md: drop queue limitation for RAID1 and RAID10 · a92ce0fe
  Mariusz Tkaczyk authored
As suggested by Neil Brown [1], this limitation seems to be obsolete.

With plugging in use, writes are processed behind the raid thread and
conf->pending_count is not increased. The limitation only comes into
play when the caller doesn't use plugs.

It can be avoided, and often is (with plugging). There are no reports of
the queue growing to an enormous size, so remove the queue limitation
for non-plugged IOs too.
      
[1] https://lore.kernel.org/linux-raid/162496301481.7211.18031090130574610495@noble.neil.brown.name

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Song Liu <song@kernel.org>
• md/raid5: play nice with PREEMPT_RT · 770b1d21
  Davidlohr Bueso authored
raid_run_ops() relies on implicitly disabled preemption for its percpu
ops, although this is really about CPU locality. It breaks RT semantics,
as the function can take regular (and thus sleeping) spinlocks, such as
the stripe_lock.

Add a local_lock such that on non-RT nothing changes and the lock
continues to map to preempt_disable()/preempt_enable(), but RT stays
happy, as the region will use a per-CPU spinlock and thus be preemptible
while still guaranteeing CPU locality.
      
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Song Liu <songliubraving@fb.com>
  4. Dec 30, 2021
• Merge tag 'nvme-5.17-2021-12-29' of git://git.infradead.org/nvme into for-5.17/drivers · 498860df
  Jens Axboe authored
      Pull NVMe updates from Christoph:
      
      "nvme updates for Linux 5.17
      
       - increment request genctr on completion (Keith Busch, Geliang Tang)
- add an 'iopolicy' module parameter (Hannes Reinecke)
       - print out valid arguments when reading from /dev/nvme-fabrics
         (Hannes Reinecke)"
      
      * tag 'nvme-5.17-2021-12-29' of git://git.infradead.org/nvme:
        nvme: add 'iopolicy' module parameter
        nvme: drop unused variable ctrl in nvme_setup_cmd
        nvme: increment request genctr on completion
        nvme-fabrics: print out valid arguments when reading from /dev/nvme-fabrics
  9. Dec 11, 2021
• null_blk: cast command status to integer · c5eafd79
  Jens Axboe authored
      kernel test robot reports that sparse now triggers a warning on null_blk:
      
      >> drivers/block/null_blk/main.c:1577:55: sparse: sparse: incorrect type in argument 3 (different base types) @@     expected int ioerror @@     got restricted blk_status_t [usertype] error @@
         drivers/block/null_blk/main.c:1577:55: sparse:     expected int ioerror
         drivers/block/null_blk/main.c:1577:55: sparse:     got restricted blk_status_t [usertype] error
      
      because blk_mq_add_to_batch() takes an integer instead of a blk_status_t.
Just cast this to an integer to silence it; null_blk is the odd one out
here, since its command status is the "right" type. If we changed the
function's type, we would have to do that for other callers too
(existing and future ones).
      
Fixes: 2385ebf3 ("block: null_blk: batched complete poll requests")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
  11. Dec 03, 2021
• block: null_blk: batched complete poll requests · 2385ebf3
  Ming Lei authored
Complete poll requests via blk_mq_add_to_batch() and
blk_mq_end_request_batch(), so that the batched completion code path is
covered when running null_blk tests.

Meanwhile, this shows a ~14% IOPS boost on 't/io_uring /dev/nullb0' in
my test.
      
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20211203081703.3506020-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
• floppy: Add max size check for user space request · 545a3249
  Xiongwei Song authored
We need to check the maximum size of a request coming from user space
before allocating pages. If the request size exceeds the limit, return
-EINVAL. This check avoids the warning below from the page allocator.
      
      WARNING: CPU: 3 PID: 16525 at mm/page_alloc.c:5344 current_gfp_context include/linux/sched/mm.h:195 [inline]
      WARNING: CPU: 3 PID: 16525 at mm/page_alloc.c:5344 __alloc_pages+0x45d/0x500 mm/page_alloc.c:5356
      Modules linked in:
      CPU: 3 PID: 16525 Comm: syz-executor.3 Not tainted 5.15.0-syzkaller #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
      RIP: 0010:__alloc_pages+0x45d/0x500 mm/page_alloc.c:5344
      Code: be c9 00 00 00 48 c7 c7 20 4a 97 89 c6 05 62 32 a7 0b 01 e8 74 9a 42 07 e9 6a ff ff ff 0f 0b e9 a0 fd ff ff 40 80 e5 3f eb 88 <0f> 0b e9 18 ff ff ff 4c 89 ef 44 89 e6 45 31 ed e8 1e 76 ff ff e9
      RSP: 0018:ffffc90023b87850 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 1ffff92004770f0b RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: 0000000000000033 RDI: 0000000000010cc1
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
      R10: ffffffff81bb4686 R11: 0000000000000001 R12: ffffffff902c1960
      R13: 0000000000000033 R14: 0000000000000000 R15: ffff88804cf64a30
      FS:  0000000000000000(0000) GS:ffff88802cd00000(0063) knlGS:00000000f44b4b40
      CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
      CR2: 000000002c921000 CR3: 000000004f507000 CR4: 0000000000150ee0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191
       __get_free_pages+0x8/0x40 mm/page_alloc.c:5418
       raw_cmd_copyin drivers/block/floppy.c:3113 [inline]
       raw_cmd_ioctl drivers/block/floppy.c:3160 [inline]
       fd_locked_ioctl+0x12e5/0x2820 drivers/block/floppy.c:3528
       fd_ioctl drivers/block/floppy.c:3555 [inline]
       fd_compat_ioctl+0x891/0x1b60 drivers/block/floppy.c:3869
       compat_blkdev_ioctl+0x3b8/0x810 block/ioctl.c:662
       __do_compat_sys_ioctl+0x1c7/0x290 fs/ioctl.c:972
       do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
       __do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
       do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:203
       entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
      
Reported-by: <syzbot+23a02c7df2cf2bc93fa2@syzkaller.appspotmail.com>
Link: https://lore.kernel.org/r/20211116131033.27685-1-sxwjean@me.com
Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
• floppy: Fix hang in watchdog when disk is ejected · fb48febc
  Tasos Sahanidis authored
      When the watchdog detects a disk change, it calls cancel_activity(),
      which in turn tries to cancel the fd_timer delayed work.
      
      In the above scenario, fd_timer_fn is set to fd_watchdog(), meaning
      it is trying to cancel its own work.
      This results in a hang as cancel_delayed_work_sync() is waiting for the
      watchdog (itself) to return, which never happens.
      
      This can be reproduced relatively consistently by attempting to read a
      broken floppy, and ejecting it while IO is being attempted and retried.
      
To resolve this, this patch calls cancel_delayed_work() instead, which
cancels the work without waiting for the watchdog to return and finish.

Before this regression was introduced, the code in this section used
del_timer(), not del_timer_sync(), to delete the watchdog timer.
      
Link: https://lore.kernel.org/r/399e486c-6540-db27-76aa-7a271b061f76@tasossah.com
Fixes: 070ad7e7 ("floppy: convert to delayed work and single-thread wq")
Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
• null_blk: allow zero poll queues · 2bfdbe8b
  Ming Lei authored
There isn't any reason to disallow zero poll queues from the user's
viewpoint.

Also, we sometimes need to compare IO polling between poll mode and IRQ
mode, so disallowing zero poll queues gets in the way.
      
Fixes: 15dfc662 ("null_blk: Fix handling of submit_queues and poll_queues attributes")
Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20211203023935.3424042-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>