Commit b3623a92 authored by Yu Kuai, committed by Li Nan

dm-raid: delay flushing event_work() after reconfig_mutex is released

mainline inclusion
from mainline-v6.7-rc7
commit db29d79b34d9593179de5f868be45c650923e7b4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8T02O

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=db29d79b34d9593179de5f868be45c650923e7b4



--------------------------------

After commit db5e653d7c9f ("md: delay choosing sync action to
md_start_sync()"), md_start_sync() holds 'reconfig_mutex'. However, to
make sure event_work is done, __md_stop() flushes the workqueue with
'reconfig_mutex' held, so if sync_work is still pending, a deadlock is
triggered.

Fortunately, former patches that fix stopping sync_thread already make
sure all sync_work is done, hence such a deadlock is no longer possible.
However, to avoid confusing people with this implicit dependency, delay
flushing event_work to dm-raid, where 'reconfig_mutex' is not held, and
add comments to emphasize that the workqueue can't be flushed with
'reconfig_mutex' held.

Fixes: db5e653d7c9f ("md: delay choosing sync action to md_start_sync()")
Depends-on: f52f5c71f3d4 ("md: fix stopping sync thread")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Li Nan <linan122@huawei.com>
parent 97229215
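
The locking problem described in the commit message can be illustrated with a small userspace analogy. This is only a sketch under stated assumptions, not kernel code: a pthread mutex stands in for 'reconfig_mutex', a thread stands in for a queued work item, and pthread_join() stands in for flushing the workqueue. The function names flush_under_lock_would_block() and stop_then_flush() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; none of this is kernel API. */
static pthread_mutex_t reconfig_mutex = PTHREAD_MUTEX_INITIALIZER;
static int sync_work_runs;

/* Probe: report whether the lock is unavailable, without blocking on it. */
static void *probe_fn(void *arg)
{
	bool *blocked = arg;
	int rc = pthread_mutex_trylock(&reconfig_mutex);

	*blocked = (rc == EBUSY);
	if (rc == 0)
		pthread_mutex_unlock(&reconfig_mutex);
	return NULL;
}

/* Plays the role of sync_work: it must take the lock to make progress. */
static void *sync_work_fn(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&reconfig_mutex);
	sync_work_runs++;
	pthread_mutex_unlock(&reconfig_mutex);
	return NULL;
}

/*
 * While the "flusher" holds the lock, a work item cannot acquire it.
 * The probe uses trylock, so joining it here is safe; a work item that
 * used a blocking lock would never finish and the join would hang.
 */
static bool flush_under_lock_would_block(void)
{
	pthread_t probe;
	bool blocked = false;

	pthread_mutex_lock(&reconfig_mutex);
	if (pthread_create(&probe, NULL, probe_fn, &blocked) == 0)
		pthread_join(probe, NULL);
	pthread_mutex_unlock(&reconfig_mutex);
	return blocked;
}

/*
 * Ordering analogous to the patched raid_dtr(): tear down under the
 * lock, release it, and only then wait for the outstanding work.
 */
static bool stop_then_flush(void)
{
	pthread_t worker;

	pthread_mutex_lock(&reconfig_mutex);
	if (pthread_create(&worker, NULL, sync_work_fn, NULL) != 0) {
		pthread_mutex_unlock(&reconfig_mutex);
		return false;
	}
	/* Teardown that needs the lock would happen here. */
	pthread_mutex_unlock(&reconfig_mutex);

	/* The "flush" happens with the lock released, so the work can run. */
	return pthread_join(worker, NULL) == 0;
}
```

Calling flush_under_lock_would_block() shows that work needing the lock cannot proceed while the caller holds it, and stop_then_flush() completes because the wait happens only after the unlock. On toolchains where pthreads is a separate library, the sketch needs to be linked with -pthread.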
+3 −0
@@ -3317,6 +3317,9 @@ static void raid_dtr(struct dm_target *ti)
	mddev_lock_nointr(&rs->md);
	md_stop(&rs->md);
	mddev_unlock(&rs->md);

+	if (work_pending(&rs->md.event_work))
+		flush_work(&rs->md.event_work);
	raid_set_free(rs);
}

+8 −3
@@ -82,6 +82,14 @@ static struct module *md_cluster_mod;

static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
static struct workqueue_struct *md_wq;

+/*
+ * This workqueue is used for sync_work to register new sync_thread, and for
+ * del_work to remove rdev, and for event_work that is only set by dm-raid.
+ *
+ * Noted that sync_work will grab reconfig_mutex, hence never flush this
+ * workqueue whith reconfig_mutex grabbed.
+ */
static struct workqueue_struct *md_misc_wq;
struct workqueue_struct *md_bitmap_wq;

@@ -6376,9 +6384,6 @@ static void __md_stop(struct mddev *mddev)
	struct md_personality *pers = mddev->pers;
	md_bitmap_destroy(mddev);
	mddev_detach(mddev);
-	/* Ensure ->event_work is done */
-	if (mddev->event_work.func)
-		flush_workqueue(md_misc_wq);
	spin_lock(&mddev->lock);
	mddev->pers = NULL;
	spin_unlock(&mddev->lock);