  1. Mar 15, 2021
    • 
      io_uring: fix concurrent parking · 9e138a48
      Pavel Begunkov authored
      
      
      If a task running io_sq_thread_park() is rescheduled right after
      set_bit(), another task may park()/unpark() before the first one gets
      back to mutex_lock(). SQPOLL then locks again and keeps running, never
      seeing the first task's set_bit(SHOULD_PARK), so it won't even try to
      put the mutex down for parking.
      
      The task will get parked eventually, when SQPOLL drops the lock to
      reschedule, but this is fragile and gets in the way of further fixes.
      
      Account the number of tasks waiting to park with a new atomic variable,
      park_pending, and adjust SHOULD_PARK accordingly. The atomic doesn't
      entirely replace the SHOULD_PARK bit, because it's convenient to keep
      it as a bit in the state, and that will help with optimisations later.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • 
      io_uring: halt SQO submission on ctx exit · f6d54255
      Pavel Begunkov authored
      
      
      io_sq_thread_finish() is called in io_ring_ctx_free(), so the SQPOLL
      task is potentially still running and submitting new requests. It's not
      a disaster, because a "try" variant of percpu_ref_get is used, but it
      is far from nice.
      
      Remove the ctx from the sqd ctx list earlier, before the cancellation
      loop, so SQPOLL can't find it and so won't submit new requests.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • 
      io_uring: replace sqd rw_semaphore with mutex · 09a6f4ef
      Pavel Begunkov authored
      
      
      The only user of read-locking sqd->rw_lock is sq_thread itself, which
      by definition runs alone, so we don't really need an rw_semaphore; a
      mutex will do. Replace it with a mutex, and kill the read-to-write
      upgrading and the extra task_work handling in io_sq_thread().
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • 
      io_uring: fix complete_post use ctx after free · 180f829f
      Pavel Begunkov authored
      
      
      If io_req_complete_post() doesn't put the final ref, we can't rely on
      the request's ctx ref, and so the ctx may potentially be freed while
      complete_post() is in io_cqring_ev_posted()/etc.
      
      In that case, take an additional ctx reference and put it at the end,
      protecting the following io_cqring_ev_posted(). Also prolong the ctx
      lifetime until spin_unlock happens, as we do with mutexes, so the added
      percpu_ref_get() doesn't race with ctx free.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • 
      io_uring: fix ->flags races by linked timeouts · efe814a4
      Pavel Begunkov authored
      
      
      It's racy to modify req->flags from a non-owning context; e.g. a linked
      timeout calling req_set_fail_links() for the master request might race
      with that request setting/clearing flags while being executed
      concurrently. Just remove req_set_fail_links(prev) from
      io_link_timeout_fn(); io_async_find_and_cancel() and functions down the
      line take care of setting the fail bit.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. Mar 14, 2021
    • 
      io_uring: convert io_buffer_idr to XArray · 9e15c3a0
      Jens Axboe authored
      Like we did for the personality idr, convert the IO buffer idr to use
      XArray. This avoids a use-after-free on removal of entries, since idr
      doesn't like doing so from inside an iterator, and it nicely reduces
      the amount of code we need to support this feature.
      
      Fixes: 5a2e745d ("io_uring: buffer registration infrastructure")
      Cc: stable@vger.kernel.org
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: yangerkun <yangerkun@huawei.com>
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. Mar 13, 2021
    • 
      io_uring: allow IO worker threads to be frozen · 16efa4fc
      Jens Axboe authored
      
      
      With the freezer using the proper signaling to notify us of when it's
      time to freeze a thread, we can re-enable normal freezer usage for the
      IO threads. Ensure that SQPOLL, io-wq, and the io-wq manager call
      try_to_freeze() appropriately, and remove the default setting of
      PF_NOFREEZE from create_io_thread().
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • 
      kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing · 15b2219f
      Jens Axboe authored
      
      
      Don't send fake signals to PF_IO_WORKER threads; they don't accept
      signals. Just treat them like kthreads in this regard: all they need
      is a wakeup, as no forced kernel/user transition is needed.
      
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • 
      io_uring: fix OP_ASYNC_CANCEL across tasks · 58f99373
      Pavel Begunkov authored
      
      
      IORING_OP_ASYNC_CANCEL tries io-wq cancellation only for the current
      task. If that fails, go over tctx_list and try it for every single tctx.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • 
      io_uring: cancel sqpoll via task_work · 521d6a73
      Pavel Begunkov authored
      
      
      1) The first problem is io_uring_cancel_sqpoll() ->
      io_uring_cancel_task_requests() basically doing park(); park(); and so
      hanging.
      
      2) Another one is more subtle: while the master task is doing
      cancellations, the SQPOLL task submits requests in between the end of
      the cancellation and finish(), taking a ref to the ctx and so eternally
      locking it up.
      
      3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll()
      racing with the same io_uring_cancel_sqpoll() from the owner task for
      tctx->wait events. And there are probably more of them.
      
      Instead, do SQPOLL cancellations from within the SQPOLL task context
      via task_work; see io_sqpoll_cancel_sync(). With that we don't need the
      temporary park()/unpark() during cancellation, which is ugly and subtle,
      and in any case doesn't allow io_run_task_work() to run properly.
      
      io_uring_cancel_sqpoll() is called only from SQPOLL task context and
      under sqd locking, so all parking is removed from there. Hence
      io_sq_thread_[un]park() and io_sq_thread_stop() are no longer used by
      the SQPOLL task, which spares us some headache.
      
      Also remove from ctx->sqd_list early to avoid 2), and kill
      tctx->sqpoll, which is not used anymore.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • 
      io_uring: prevent racy sqd->thread checks · 26984fbf
      Pavel Begunkov authored
      
      
      The SQPOLL thread we're trying to attach to may be going away. That's
      not nice, but a more serious problem is if io_sq_offload_create() sees
      sqd->thread == NULL and tries to init it with a new thread. There are
      tons of ways that can be exploited or fail.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. Mar 12, 2021
  5. Mar 10, 2021
  6. Mar 08, 2021