Commit 740e70cf authored by Li Nan's avatar Li Nan Committed by openeuler-sync-bot
Browse files

md/raid10: fix taks hung in raid10d

hulk inclusion
category: bugfix
bugzilla: 188380, https://gitee.com/openeuler/kernel/issues/I6GISC


CVE: NA

--------------------------------

commit fe630de0 ("md/raid10: avoid deadlock on recovery.") allowed
normal io and sync io to exist at the same time. Task hung will occur as
below:

T1                      T2		T3		T4
raid10d
 handle_read_error
  allow_barrier
   conf->nr_pending--
    -> 0
                        //submit sync io
                        raid10_sync_request
                         raise_barrier
			  ->will not be blocked
			  ...
			//submit to drivers
  raid10_read_request
   wait_barrier
    conf->nr_pending++
     -> 1
					//retry read fail
					raid10_end_read_request
					 reschedule_retry
					  add to retry_list
					  conf->nr_queued++
					   -> 1
							//sync io fail
							end_sync_read
							 __end_sync_read
							  reschedule_retry
							   add to retry_list
					                    conf->nr_queued++
							     -> 2
 ...
 handle_read_error
  freeze_array
   wait nr_pending == nr_queued+1
        ->1	      ->3
   //task hung

retry read and sync io will be added to retry_list(nr_queued->2) if they
fails. raid10d() called handle_read_error() and hung in freeze_array().
nr_queued will not decrease because raid10d is blocked, nr_pending will
not increase because conf->barrier is not released.

Fix it by moving allow_barrier() after raid10_read_request().
raise_barrier() will wait for nr_waiting to become 0. Therefore, sync io
and regular io will not be issued at the same time.

We also removed the check of nr_queued. It can be 0 but don't need to be
blocked. MD_RECOVERY_RUNNING always is set after this patch, because all
sync io is waitting in raise_barrier(), remove it, too.

Fixes: fe630de0 ("md/raid10: avoid deadlock on recovery.")
Signed-off-by: default avatarLi Nan <linan122@huawei.com>
Reviewed-by: default avatarHou Tao <houtao1@huawei.com>
(cherry picked from commit 1fe782f0)
parent 5ac17ab1
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment