  1. Mar 23, 2022
    • io_uring: bump poll refs to full 31-bits · e2c0cb7c
      Jens Axboe authored
      
      
      The previous commit:
      
      61bc84c40088 ("io_uring: remove poll entry from list when canceling all")
      
      removed a potential overflow condition for the poll references. They
      are currently limited to 20-bits, even though we have 31-bits available. The
      upper bit is used to mark for cancelation.
      
      Bump the poll ref space to 31-bits, making that kind of situation much
      harder to trigger in general. We'll separately add overflow checking
      and handling.
      
      Fixes: aa43477b ("io_uring: poll rework")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
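The bit layout described in this commit message can be sketched in user-space C as follows. This is a minimal illustration, not the kernel source: the macro and function names (`POLL_CANCEL_FLAG`, `POLL_REF_MASK`, `poll_ref_count`, `poll_is_canceled`) are hypothetical, and only the 20-bit and 31-bit widths and the "upper bit marks cancelation" rule come from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, modeled on the commit text rather than copied from the kernel. */
#define POLL_CANCEL_FLAG  (UINT32_C(1) << 31)        /* upper bit: marks cancelation */
#define POLL_REF_MASK_OLD ((UINT32_C(1) << 20) - 1)  /* previous 20-bit reference space */
#define POLL_REF_MASK     (POLL_CANCEL_FLAG - 1)     /* bits 0..30: full 31-bit reference space */

/* Number of references held, ignoring the cancelation bit. */
static inline uint32_t poll_ref_count(uint32_t refs)
{
    return refs & POLL_REF_MASK;
}

/* True if the cancelation bit is set, regardless of the reference count. */
static inline bool poll_is_canceled(uint32_t refs)
{
    return (refs & POLL_CANCEL_FLAG) != 0;
}
```

Under the 20-bit mask, a count of `1 << 20` masks to zero and is indistinguishable from "no references", which is the kind of situation the wider mask makes much harder to reach.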
  2. Mar 22, 2022
    • io_uring: remove poll entry from list when canceling all · 61bc84c4
      Jens Axboe authored
      
      
      When the ring is exiting, as part of the shutdown, poll requests are
      removed. But io_poll_remove_all() does not remove entries when finding
      them, and since completions are done out-of-band, we can find and remove
      the same entry multiple times.
      
      We do guard the poll execution by poll ownership, but that does not
      exclude us from reissuing a new one once the previous removal ownership
      goes away.
      
      This can race with poll execution as well, where we then end up seeing
      req->apoll be NULL because a previous task_work requeue finished the
      request.
      
      Remove the poll entry when we find it and get ownership of it. This
      prevents multiple invocations from finding it.
      
      Fixes: aa43477b ("io_uring: poll rework")
      Reported-by: Dylan Yudaken <dylany@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
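The idea of unlinking an entry as soon as ownership is obtained can be sketched as below. This is a simplified, single-threaded model under stated assumptions: `struct poll_req`, `poll_get_ownership()` and `poll_remove_all()` are hypothetical stand-ins, ownership is a plain boolean rather than an atomic reference count, and the list is a singly linked list rather than the kernel's hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified poll request; not the kernel's struct io_kiocb. */
struct poll_req {
    struct poll_req *next;
    bool owned;        /* stand-in for poll ownership */
    int cancel_count;  /* how many times a cancel pass acted on this entry */
};

/* Try to take ownership; fails if someone else already holds it. */
static bool poll_get_ownership(struct poll_req *req)
{
    if (req->owned)
        return false;
    req->owned = true;
    return true;
}

/*
 * Walk the list and cancel every entry we can take ownership of.
 * Each such entry is unlinked immediately, so a later pass cannot
 * find and cancel it again once ownership has been dropped.
 * Returns the number of entries canceled in this pass.
 */
static int poll_remove_all(struct poll_req **head)
{
    struct poll_req **pp = head;
    int found = 0;

    while (*pp) {
        struct poll_req *req = *pp;

        if (poll_get_ownership(req)) {
            *pp = req->next;   /* unlink on find */
            req->next = NULL;
            req->cancel_count++;
            found++;
        } else {
            pp = &req->next;   /* owned elsewhere: leave it linked */
        }
    }
    return found;
}
```

Without the unlink step, a second pass run after ownership is released would find the same entries and act on them again, which is the repeated removal the commit message describes.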
  3. Mar 21, 2022
  4. Mar 20, 2022
    • io_uring: recycle provided before arming poll · abdad709
      Jens Axboe authored
      
      
      We currently have a race where we recycle the selected buffer if poll
      returns IO_APOLL_OK. But that's too late, as the poll could already be
      triggering or have triggered. If that race happens, then we're putting a
      buffer that's already being used.
      
      Fix this by recycling before we arm poll. This does mean that we'll
      sometimes almost instantly re-select the buffer, but it's rare enough in
      testing that it should not pose a performance issue.
      
      Fixes: b1c62645 ("io_uring: recycle provided buffers if request goes async")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. Mar 19, 2022
  6. Mar 18, 2022
    • io_uring: manage provided buffers strictly ordered · dbc7d452
      Jens Axboe authored
      
      
      Workloads using provided buffers benefit from using and returning buffers
      in the right order, and so do TLBs for that matter. Manage the internal
      buffer list in a straight list, rather than use the head buffer as the
      insertion node. Use a hashed list for the buffer group IDs instead of
      xarray, as the overhead is much lower this way. xarray provides internal
      locking and other trickery that is handy for some use cases, but
      io_uring already locks internally for the buffer manipulation and needs
      none of that.
      
      This is good for about a 2% reduction in overhead, from a combination of
      the improved management and the fact that the workload has an easier time
      bundling back provided buffers.
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
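The two structures named in this commit message, a hashed list keyed by buffer group ID and a strictly ordered buffer list per group, can be sketched as follows. All type and function names here (`buf_table`, `buf_group`, `group_lookup`, `buf_provide`, `buf_select`) and the bucket count are assumptions for illustration; the kernel's actual layout may differ. As in the commit text, no locking is done inside these helpers, on the assumption that the caller already holds the relevant lock.

```c
#include <assert.h>
#include <stddef.h>

#define BGID_HASH_BITS 4
#define BGID_HASH_SIZE (1u << BGID_HASH_BITS)

struct buf {
    struct buf *next;
    unsigned int bid;              /* buffer ID */
};

struct buf_group {
    struct buf_group *hash_next;   /* chain within one hash bucket */
    unsigned int bgid;             /* buffer group ID */
    struct buf *head, *tail;       /* buffers in the order they were provided */
};

struct buf_table {
    struct buf_group *buckets[BGID_HASH_SIZE];
};

static unsigned int bgid_hash(unsigned int bgid)
{
    return bgid & (BGID_HASH_SIZE - 1);
}

/* Register a group in the hashed list. */
static void group_add(struct buf_table *t, struct buf_group *g)
{
    unsigned int idx = bgid_hash(g->bgid);

    g->hash_next = t->buckets[idx];
    t->buckets[idx] = g;
}

/* Find a group by ID, or return NULL if it was never registered. */
static struct buf_group *group_lookup(struct buf_table *t, unsigned int bgid)
{
    struct buf_group *g = t->buckets[bgid_hash(bgid)];

    while (g && g->bgid != bgid)
        g = g->hash_next;
    return g;
}

/* Append at the tail so buffers are handed out in the order provided. */
static void buf_provide(struct buf_group *g, struct buf *b)
{
    b->next = NULL;
    if (g->tail)
        g->tail->next = b;
    else
        g->head = b;
    g->tail = b;
}

/* Take the oldest buffer from the head, or NULL if the group is empty. */
static struct buf *buf_select(struct buf_group *g)
{
    struct buf *b = g->head;

    if (!b)
        return NULL;
    g->head = b->next;
    if (!g->head)
        g->tail = NULL;
    b->next = NULL;
    return b;
}
```

Providing buffers 0, 1, 2 to a group and then selecting three times returns them in that same order, which is the "strictly ordered" behavior the commit title refers to.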
  7. Mar 17, 2022
  8. Mar 16, 2022
    • io_uring: make tracing format consistent · 052ebf1f
      Dylan Yudaken authored
      
      
      Make the tracing formatting for user_data and flags consistent.
      
      Having consistent formatting allows one for example to grep for a specific
      user_data/flags and be able to trace a single sqe through easily.
      
      Change user_data to 0x%llx and flags to 0x%x everywhere. The '0x' is
      useful to disambiguate for example "user_data 100".
      
      Additionally remove the '=' for flags in io_uring_req_failed, again for consistency.
      
      Signed-off-by: Dylan Yudaken <dylany@fb.com>
      Link: https://lore.kernel.org/r/20220316095204.2191498-1-dylany@fb.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
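The format specifiers named in this commit message can be tried out with a small user-space helper. The helper name `format_trace` and the surrounding field text are hypothetical; only the `0x%llx` and `0x%x` conversions come from the commit message, and the real tracepoints print more fields than this.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format user_data and flags the way the commit message describes:
 * user_data as 0x%llx and flags as 0x%x, with no '=' before the value.
 * Returns what snprintf returns.
 */
static int format_trace(char *out, size_t len,
                        unsigned long long user_data, unsigned int flags)
{
    return snprintf(out, len, "user_data 0x%llx, flags 0x%x", user_data, flags);
}
```

With this format, a decimal `user_data` of 100 is printed as `user_data 0x64`, so a line reading `user_data 0x100` unambiguously means 256, whereas a bare `user_data 100` could be read either way.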
    • io_uring: recycle apoll_poll entries · 4d9237e3
      Jens Axboe authored
      
      
      Particularly for networked workloads, io_uring intensively uses its
      poll based backend to get a notification when data/space is available.
      Profiling workloads, we see 3-4% of alloc+free that is directly attributed
      to just the apoll allocation and free (and the rest being skb alloc+free).
      
      For the fast path, we have ctx->uring_lock held already for both issue
      and the inline completions, and we can utilize that to avoid any extra
      locking needed to have a basic recycling cache for the apoll entries on
      both the alloc and free side.
      
      Double poll still requires an allocation. But those are rare and not
      a fast path item.
      
      With the simple cache in place, we see a 3-4% reduction in overhead for
      the workload.
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
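A basic recycling cache of the kind this commit message describes might look like the sketch below. It is a user-space model under stated assumptions: `struct apoll`, `struct apoll_cache` and the three helpers are hypothetical names, `malloc`/`free` stand in for the kernel allocator, and the caller is assumed to already hold the lock that serializes issue and inline completion, so the cache itself does no locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cached object; the real entry carries poll state. */
struct apoll {
    struct apoll *next;   /* links free entries in the cache */
};

struct apoll_cache {
    struct apoll *free_list;
};

/* Reuse a cached entry if one exists, otherwise fall back to allocating. */
static struct apoll *apoll_get(struct apoll_cache *c)
{
    struct apoll *p = c->free_list;

    if (p) {
        c->free_list = p->next;
        return p;
    }
    return malloc(sizeof(*p));
}

/* Return an entry to the cache instead of freeing it. */
static void apoll_put(struct apoll_cache *c, struct apoll *p)
{
    p->next = c->free_list;
    c->free_list = p;
}

/* Free everything still cached, for example at teardown. */
static void apoll_cache_drain(struct apoll_cache *c)
{
    while (c->free_list) {
        struct apoll *p = c->free_list;

        c->free_list = p->next;
        free(p);
    }
}
```

Because get and put both run under a lock the caller already holds, the fast path reduces to a couple of pointer updates, which is where the saving over a full allocation and free comes from.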
  9. Mar 12, 2022
  10. Mar 11, 2022
  11. Mar 10, 2022