  1. Feb 21, 2021
    • io_uring: wait potential ->release() on resurrect · 88f171ab
      Pavel Begunkov authored
      
      
      There is a short window where percpu_refs have already dropped to zero
      but we still try to resurrect() them. Play nicer: wait for ->release()
      to happen in this case and proceed as if everything is ok. One downside
      for ctx refs is that we can ignore signal_pending() on a rare occasion,
      but someone else should check for it later if needed.
      
      Cc: <stable@vger.kernel.org> # 5.5+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: keep generic rsrc infra generic · f2303b1f
      Pavel Begunkov authored
      
      
      io_rsrc_ref_quiesce() is a generic resource function, but it is
      currently wired to allocate and initialise ref nodes with file-specific
      callbacks/etc. Keep it sane by passing in as parameters everything
      needed for initialisation, otherwise it will hurt us badly one day.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
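
The idea of keeping the generic layer free of file-specific knowledge can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct rsrc_node`, `rsrc_put_fn`, `rsrc_node_init()`, `file_put()` and `buffer_put()` are hypothetical names invented here, not the kernel's types or functions.

```c
#include <assert.h>
#include <stddef.h>

struct rsrc_node;

/* Hypothetical callback type: how a node releases its resources. */
typedef void (*rsrc_put_fn)(struct rsrc_node *node);

/* Hypothetical stand-in for a reference node; not the kernel structure. */
struct rsrc_node {
    rsrc_put_fn put;   /* supplied by the caller, never hard-coded here */
    int released;      /* records which callback ran, for the demo only */
};

/* Generic initialiser: everything resource-specific arrives as a parameter. */
static void rsrc_node_init(struct rsrc_node *node, rsrc_put_fn put)
{
    assert(node != NULL && put != NULL);
    node->put = put;
    node->released = 0;
}

/* Generic release path: dispatches through the stored callback. */
static void rsrc_node_release(struct rsrc_node *node)
{
    node->put(node);
}

/* Resource-specific callbacks live with the callers. */
static void file_put(struct rsrc_node *node)   { node->released = 1; }
static void buffer_put(struct rsrc_node *node) { node->released = 2; }
```

With this shape, a second resource type only needs its own callback; the generic init and release code stays untouched.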
    • io_uring: zero ref_node after killing it · e6cb007c
      Pavel Begunkov authored
      
      
      After a rsrc/files reference node's refs are killed, it must never be
      used. And that's how it works: the code either assigns a new node or
      kills the whole data table.
      
      Let's explicitly NULL it. That shouldn't be necessary, but if something
      goes wrong I'd rather catch a NULL dereference than use a dangling
      pointer.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
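
The defensive pattern described in this commit, clearing a pointer right after the object behind it is torn down, can be illustrated with a small userspace sketch. The names `struct ref_node`, `struct rsrc_data`, `rsrc_kill_node()` and `rsrc_node_usable()` are hypothetical stand-ins chosen for this example, and `free()` stands in for killing the node's refs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct ref_node {
    int refs;
};

struct rsrc_data {
    struct ref_node *node;
};

/*
 * Tear down the current node and clear the pointer, so that any later
 * use faults as a NULL dereference instead of touching freed memory.
 */
static void rsrc_kill_node(struct rsrc_data *data)
{
    assert(data != NULL);
    free(data->node);     /* free(NULL) is a no-op, so a repeat call is safe */
    data->node = NULL;
}

/* Callers can check for a live node before using it. */
static int rsrc_node_usable(const struct rsrc_data *data)
{
    return data->node != NULL;
}
```

The cost is a single store, and a stale access then fails predictably at the point of misuse.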
    • io_uring: make the !CONFIG_NET helpers a bit more robust · 99a10081
      Jens Axboe authored
      
      
      With the prep and prep async split, we now have potentially 3 helpers
      that need to be defined for !CONFIG_NET. Add some helpers to do just
      that.
      
      Fixes the following compile error on !CONFIG_NET:
      
      fs/io_uring.c:6171:10: error: implicit declaration of function
      'io_sendmsg_prep_async'; did you mean 'io_req_prep_async'?
      [-Werror=implicit-function-declaration]
         return io_sendmsg_prep_async(req);
                   ^~~~~~~~~~~~~~~~~~~~~
      	     io_req_prep_async
      
      Fixes: 93642ef8 ("io_uring: split sqe-prep and async setup")
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
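
One way to keep a family of stubs in sync when a feature is compiled out is to generate them from macros, so that an operation with up to three entry points (issue, prep, prep async) cannot end up with one of them missing. The following is a minimal sketch of that technique, not the code from the commit: `HAVE_NET` stands in for `CONFIG_NET`, and `struct request`, the `NETOP_*` macros and the `demo_` prefix are all hypothetical.

```c
#include <errno.h>

/* Hypothetical stand-in for the request type. */
struct request {
    int opcode;
};

#ifndef HAVE_NET   /* stand-in for !CONFIG_NET */

/* Emit one stub that reports the operation as unsupported. */
#define NETOP_STUB(fn)                          \
static inline int fn(struct request *req)       \
{                                               \
    (void)req;                                  \
    return -EOPNOTSUPP;                         \
}

/* Issue handler only. */
#define NETOP_FN(op)         NETOP_STUB(demo_##op)
/* Issue handler plus prep helper. */
#define NETOP_PREP(op)       NETOP_FN(op) NETOP_STUB(demo_##op##_prep)
/* Issue handler, prep helper and async prep helper. */
#define NETOP_PREP_ASYNC(op) NETOP_PREP(op) NETOP_STUB(demo_##op##_prep_async)

NETOP_PREP_ASYNC(sendmsg)
NETOP_PREP_ASYNC(recvmsg)
NETOP_PREP(accept)
NETOP_FN(send)

#endif /* !HAVE_NET */
```

Here `NETOP_PREP_ASYNC(sendmsg)` expands to `demo_sendmsg()`, `demo_sendmsg_prep()` and `demo_sendmsg_prep_async()`, so a caller of the async prep helper still compiles when networking is disabled.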
    • io_uring: don't hold uring_lock when calling io_run_task_work* · 8bad28d8
      Hao Xu authored
      
      
      Abaci reported the below issue:
      [  141.400455] hrtimer: interrupt took 205853 ns
      [  189.869316] process 'usr/local/ilogtail/ilogtail_0.16.26' started with executable stack
      [  250.188042]
      [  250.188327] ============================================
      [  250.189015] WARNING: possible recursive locking detected
      [  250.189732] 5.11.0-rc4 #1 Not tainted
      [  250.190267] --------------------------------------------
      [  250.190917] a.out/7363 is trying to acquire lock:
      [  250.191506] ffff888114dbcbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __io_req_task_submit+0x29/0xa0
      [  250.192599]
      [  250.192599] but task is already holding lock:
      [  250.193309] ffff888114dbfbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_register+0xad/0x210
      [  250.194426]
      [  250.194426] other info that might help us debug this:
      [  250.195238]  Possible unsafe locking scenario:
      [  250.195238]
      [  250.196019]        CPU0
      [  250.196411]        ----
      [  250.196803]   lock(&ctx->uring_lock);
      [  250.197420]   lock(&ctx->uring_lock);
      [  250.197966]
      [  250.197966]  *** DEADLOCK ***
      [  250.197966]
      [  250.198837]  May be due to missing lock nesting notation
      [  250.198837]
      [  250.199780] 1 lock held by a.out/7363:
      [  250.200373]  #0: ffff888114dbfbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_register+0xad/0x210
      [  250.201645]
      [  250.201645] stack backtrace:
      [  250.202298] CPU: 0 PID: 7363 Comm: a.out Not tainted 5.11.0-rc4 #1
      [  250.203144] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  250.203887] Call Trace:
      [  250.204302]  dump_stack+0xac/0xe3
      [  250.204804]  __lock_acquire+0xab6/0x13a0
      [  250.205392]  lock_acquire+0x2c3/0x390
      [  250.205928]  ? __io_req_task_submit+0x29/0xa0
      [  250.206541]  __mutex_lock+0xae/0x9f0
      [  250.207071]  ? __io_req_task_submit+0x29/0xa0
      [  250.207745]  ? 0xffffffffa0006083
      [  250.208248]  ? __io_req_task_submit+0x29/0xa0
      [  250.208845]  ? __io_req_task_submit+0x29/0xa0
      [  250.209452]  ? __io_req_task_submit+0x5/0xa0
      [  250.210083]  __io_req_task_submit+0x29/0xa0
      [  250.210687]  io_async_task_func+0x23d/0x4c0
      [  250.211278]  task_work_run+0x89/0xd0
      [  250.211884]  io_run_task_work_sig+0x50/0xc0
      [  250.212464]  io_sqe_files_unregister+0xb2/0x1f0
      [  250.213109]  __io_uring_register+0x115a/0x1750
      [  250.213718]  ? __x64_sys_io_uring_register+0xad/0x210
      [  250.214395]  ? __fget_files+0x15a/0x260
      [  250.214956]  __x64_sys_io_uring_register+0xbe/0x210
      [  250.215620]  ? trace_hardirqs_on+0x46/0x110
      [  250.216205]  do_syscall_64+0x2d/0x40
      [  250.216731]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  250.217455] RIP: 0033:0x7f0fa17e5239
      [  250.218034] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05  3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48
      [  250.220343] RSP: 002b:00007f0fa1eeac48 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
      [  250.221360] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0fa17e5239
      [  250.222272] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000008
      [  250.223185] RBP: 00007f0fa1eeae20 R08: 0000000000000000 R09: 0000000000000000
      [  250.224091] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      [  250.224999] R13: 0000000000021000 R14: 0000000000000000 R15: 00007f0fa1eeb700
      
      This is caused by calling io_run_task_work_sig() to run work that takes
      uring_lock while the caller, io_sqe_files_unregister(), already holds
      uring_lock.
      To fix this, briefly drop uring_lock around the call to
      io_run_task_work_sig(). Three things need care:
      
      - hold uring_lock in io_ring_ctx_free() around io_sqe_files_unregister()
          this is for consistency of lock/unlock.
      - add a new fixed rsrc ref node before dropping uring_lock
          it's not safe to do io_uring_enter-->percpu_ref_get() with a dying one.
      - check if rsrc_data->refs is dying to avoid parallel io_sqe_files_unregister
      
      Reported-by: Abaci <abaci@linux.alibaba.com>
      Fixes: 1ffc5422 ("io_uring: fix io_sqe_files_unregister() hangs")
      Suggested-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
      [axboe: fixes from Pavel folded in]
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
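
The locking problem in this commit, deferred work that takes a lock its caller already holds, can be modelled with a toy non-recursive lock. This is a sketch under loud assumptions: `struct toy_lock`, `struct toy_ctx`, `run_task_work()`, `unregister_buggy()` and `unregister_fixed()` are hypothetical names, the lock records a recursive acquire instead of deadlocking, and none of the real bookkeeping (ref nodes, dying-ref checks) is represented.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: counts recursive acquires instead of hanging. */
struct toy_lock {
    bool held;
    int recursion_errors;
};

static bool toy_lock_acquire(struct toy_lock *l)
{
    if (l->held) {
        l->recursion_errors++;   /* a real mutex would deadlock here */
        return false;
    }
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Hypothetical context with one lock and a completion counter. */
struct toy_ctx {
    struct toy_lock lock;
    int completed;
};

/* Deferred work that needs the lock itself, like the submit path in the trace. */
static void run_task_work(struct toy_ctx *c)
{
    if (!toy_lock_acquire(&c->lock))
        return;                  /* recursive acquire: work cannot run */
    c->completed++;
    toy_lock_release(&c->lock);
}

/* Buggy shape: entered with the lock held, runs the work without dropping it. */
static void unregister_buggy(struct toy_ctx *c)
{
    run_task_work(c);
}

/* Fixed shape: entered with the lock held, drops it around the work. */
static void unregister_fixed(struct toy_ctx *c)
{
    toy_lock_release(&c->lock);
    run_task_work(c);
    (void)toy_lock_acquire(&c->lock);   /* retake before returning to the caller */
}
```

Because the lock is briefly released, any state that other threads could observe in that window has to be made consistent first, which is what the list of precautions in the commit message is about.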
    • io_uring: fail io-wq submission from a task_work · a3df7698
      Pavel Begunkov authored
      
      
      In case of failure, io_wq_submit_work() needs to post a CQE and so
      potentially take uring_lock. The safest way to deal with it is to do
      that from task_work, where we can safely take the lock.
      
      Also, as io_iopoll_check() holds the lock tight and releases it
      reluctantly, this will play nicer in the future with notifying an
      iopolling task about new pending failed requests.
      
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. Feb 19, 2021
  3. Feb 18, 2021
  4. Feb 17, 2021
  5. Feb 16, 2021
  6. Feb 14, 2021
  7. Feb 13, 2021
  8. Feb 12, 2021
  9. Feb 11, 2021