Commit c7d95613 authored by Pavel Begunkov, committed by Jens Axboe

io_uring: fix early sqd_list removal sqpoll hangs



[  245.463317] INFO: task iou-sqp-1374:1377 blocked for more than 122 seconds.
[  245.463334] task:iou-sqp-1374    state:D flags:0x00004000
[  245.463345] Call Trace:
[  245.463352]  __schedule+0x36b/0x950
[  245.463376]  schedule+0x68/0xe0
[  245.463385]  __io_uring_cancel+0xfb/0x1a0
[  245.463407]  do_exit+0xc0/0xb40
[  245.463423]  io_sq_thread+0x49b/0x710
[  245.463445]  ret_from_fork+0x22/0x30

It happens when the sqpoll thread skips running park_task_work and goes
to exit: the exiting user may then remove ctx from sqd_list, so the
corresponding io_sq_thread() -> io_uring_cancel_sqpoll() is never
executed. Luckily, the thread just gets stuck in do_exit() in this case.

Fixes: dbe1bdbb ("io_uring: handle signals for IO threads like a normal thread")
Reported-by: Joakim Hassila <joj@mac.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
parent c60eb049
+5 −2
@@ -6754,6 +6754,9 @@ static int io_sq_thread(void *data)
 	current->flags |= PF_NO_SETAFFINITY;
 
 	mutex_lock(&sqd->lock);
+	/* a user may had exited before the thread started */
+	io_run_task_work_head(&sqd->park_task_work);
+
 	while (!test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state)) {
 		int ret;
 		bool cap_entries, sqt_spin, needs_sched;
@@ -6770,10 +6773,10 @@ static int io_sq_thread(void *data)
 			}
 			cond_resched();
 			mutex_lock(&sqd->lock);
-			if (did_sig)
-				break;
 			io_run_task_work();
 			io_run_task_work_head(&sqd->park_task_work);
+			if (did_sig)
+				break;
 			timeout = jiffies + sqd->sq_thread_idle;
 			continue;
 		}
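
The ordering change in the second hunk can be illustrated with a small user-space sketch. This is not kernel code: `struct park_queue`, `run_park_work()`, `loop_pass_old()` and `loop_pass_new()` are hypothetical stand-ins, assuming that `sqd->park_task_work` behaves like a queue of callbacks that must be drained before the thread leaves its loop. It only models why checking `did_sig` before draining the queue can leave queued work unexecuted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for sqd->park_task_work: counts queued callbacks. */
struct park_queue {
	int pending;   /* callbacks queued, e.g. by an exiting user */
	int executed;  /* callbacks the poller thread has run */
};

/* Hypothetical stand-in for io_run_task_work_head(): drain the queue. */
static void run_park_work(struct park_queue *q)
{
	assert(q != NULL);
	q->executed += q->pending;
	q->pending = 0;
}

/*
 * One pass through the signal-handling branch with the pre-fix ordering.
 * Returns true to keep looping, false to break out towards exit.
 */
static bool loop_pass_old(struct park_queue *q, bool did_sig)
{
	if (did_sig)
		return false;   /* breaks before the queued work is run */
	run_park_work(q);
	return true;
}

/* The same pass with the post-fix ordering: drain first, then check. */
static bool loop_pass_new(struct park_queue *q, bool did_sig)
{
	run_park_work(q);
	if (did_sig)
		return false;   /* breaks only after the queue is empty */
	return true;
}
```

With one callback queued and `did_sig` set, `loop_pass_old()` returns with the callback still pending, while `loop_pass_new()` runs it before breaking. In the commit this matters because the skipped work is what would have led to `io_uring_cancel_sqpoll()` being executed.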