tipc: fix a deadlock when flushing scheduled work
In the commit fdeba99b ("tipc: fix use-after-free in tipc_bcast_get_mode"), we're trying to make sure the tipc_net_finalize_work work item finished if it enqueued. But calling flush_scheduled_work() is not just affecting above work item but either any scheduled work. This has turned out to be overkill and caused to deadlock as syzbot reported: ====================================================== WARNING: possible circular locking dependency detected 5.9.0-rc2-next-20200828-syzkaller #0 Not tainted ------------------------------------------------------ kworker/u4:6/349 is trying to acquire lock: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777 but task is already holding lock: ffffffff8a879430 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x9b/0xb10 net/core/net_namespace.c:565 [...] Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pernet_ops_rwsem); lock(&sb->s_type->i_mutex_key#13); lock(pernet_ops_rwsem); lock((wq_completion)events); *** DEADLOCK *** [...] v1: To fix the original issue, we replace above calling by introducing a bit flag. When a namespace cleaned-up, bit flag is set to zero and: - tipc_net_finalize functionial just does return immediately. - tipc_net_finalize_work does not enqueue into the scheduled work queue. v2: Use cancel_work_sync() helper to make sure ONLY the tipc_net_finalize_work() stopped before releasing bcbase object. Reported-by: <syzbot+d5aa7e0385f6a5d0f4fd@syzkaller.appspotmail.com> Fixes: fdeba99b ("tipc: fix use-after-free in tipc_bcast_get_mode") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Please register or sign in to comment