Unverified commit 6b33e607, authored by openeuler-ci-bot, committed by Gitee

!1441 workqueue: fix sanity check warning when invoke destroy_workqueue()

Merge Pull Request from: @henryze 
 
https://gitee.com/openeuler/kernel/issues/I7LRJF?from=project-issue

The warning logs are listed below:

WARNING: CPU: 0 PID: 19336 at kernel/workqueue.c:4430 destroy_workqueue+0x11a/0x2f0
*****
destroy_workqueue: test_workqueue9 has the following busy pwq
  pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=0/1 refcnt=2
      in-flight: 5658:wq_barrier_func
Showing busy workqueues and worker pools:
*****

It shows that even after drain_workqueue() returns, the barrier work item
is still in flight and the pwq (and a worker) is still busy on it.

The problem is caused by drain_workqueue() not watching flush_work():
~~~
Thread A				Worker
					/* normal work item with linked */
					process_scheduled_works()
destroy_workqueue()			  process_one_work()
  drain_workqueue()			    /* run normal work item */
				 /--	    pwq_dec_nr_in_flight()
    flush_workqueue()	    <---/
		/* the last normal work item is done */
  sanity_check				  process_one_work()
				       /--  raw_spin_unlock_irq(&pool->lock)
    raw_spin_lock_irq(&pool->lock)  <-/     /* maybe preempt */
    *WARNING*				    wq_barrier_func()
					    /* maybe preempt by cond_resched() */
~~~
So the solution is to make drain_workqueue() watch for flush_work().
Because drain_workqueue() waits by calling flush_workqueue(), this means
making flush_workqueue() watch for flush_work().

For historical convenience, WORK_NO_COLOR has been used for barrier work
items queued by flush_work().  The color serves two purposes:
	The barrier does not participate in flushing
	The barrier does not participate in nr_active

Only the second purpose is obligatory.  So the plan is to mark barrier
work items inactive without using WORK_NO_COLOR in patch4 so that we can
assign a flushing color to them in patch5.
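
The two purposes can be separated in a small user-space model. This is only
a sketch under stated assumptions, not kernel code: every `toy_*` name, the
color count, and the rule that a colored barrier takes the pwq's current
work color are hypothetical simplifications. It shows that a barrier without
a color is invisible to a flusher watching the per-color in-flight counter,
while a colored but inactive barrier is visible to the flusher and still
takes no nr_active slot.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_COLORS 16
#define TOY_NO_COLOR  TOY_NR_COLORS	/* stands in for WORK_NO_COLOR */

/* Toy stand-in for a pool_workqueue; not the kernel structure. */
struct toy_pwq {
	int work_color;				/* color given to new items */
	int nr_in_flight[TOY_NR_COLORS];	/* counter a flusher waits on */
	int nr_active;				/* items counted as active */
};

struct toy_work {
	int color;
	bool counts_active;
};

/* A normal item takes the current color and one nr_active slot. */
static struct toy_work toy_queue_normal(struct toy_pwq *pwq)
{
	struct toy_work w = { .color = pwq->work_color, .counts_active = true };

	pwq->nr_in_flight[w.color]++;
	pwq->nr_active++;
	return w;
}

/*
 * A barrier never takes an nr_active slot.  With colored == false it
 * also has no color (old scheme); with colored == true it is counted
 * in nr_in_flight (assumed model of the new scheme).
 */
static struct toy_work toy_queue_barrier(struct toy_pwq *pwq, bool colored)
{
	struct toy_work w = {
		.color = colored ? pwq->work_color : TOY_NO_COLOR,
		.counts_active = false,
	};

	if (w.color != TOY_NO_COLOR)
		pwq->nr_in_flight[w.color]++;
	return w;
}

/* Completion drops whatever counters the item was holding. */
static void toy_finish(struct toy_pwq *pwq, const struct toy_work *w)
{
	if (w->counts_active) {
		assert(pwq->nr_active > 0);
		pwq->nr_active--;
	}
	if (w->color != TOY_NO_COLOR) {
		assert(pwq->nr_in_flight[w->color] > 0);
		pwq->nr_in_flight[w->color]--;
	}
}

/* A flusher of the given color must wait while items of it remain. */
static bool toy_flush_must_wait(const struct toy_pwq *pwq, int color)
{
	return pwq->nr_in_flight[color] > 0;
}

/*
 * Scenario from the diagram: a normal item with a linked barrier, the
 * normal item has finished, the barrier has not run yet.  Returns
 * whether a flusher would still wait; stores nr_active in the out arg.
 */
static bool toy_barrier_blocks_flush(bool colored, int *nr_active_out)
{
	struct toy_pwq pwq = { 0 };
	struct toy_work normal = toy_queue_normal(&pwq);
	struct toy_work barrier = toy_queue_barrier(&pwq, colored);

	(void)barrier;			/* still pending in this scenario */
	toy_finish(&pwq, &normal);
	if (nr_active_out)
		*nr_active_out = pwq.nr_active;
	return toy_flush_must_wait(&pwq, normal.color);
}
```

In this model, toy_barrier_blocks_flush(false, ...) reports that the flusher
would not wait even though the barrier is still pending, which corresponds
to the window in which the sanity check fires. With `colored` set to true
the flusher waits, and nr_active is 0 in both cases.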

Patch1-3 are preparation, and patch6 is a cleanup.

Test steps:
insmod wq_issue.ko
rmmod wq_issue

~~~
# insmod wq_issue.ko
[   14.061088] wq_issue: loading out-of-tree module taints kernel.
[   14.070509] wq_test_init
[   14.072112] wq_test_init done
[   14.074035] insmod (92) used greatest stack depth: 13840 bytes left
/tmp # rmmod wq_issue.ko
[   24.489421] wq_test_exit done
/tmp # uname -a
Linux (none) 5.10.0+ #10 SMP Wed Jul 26 15:48:31 CST 2023 x86_64 GNU/Linux
~~~ 
 
Link:https://gitee.com/openeuler/kernel/pulls/1441

 

Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
parents 0c40bf67 13b0c50f