Commit 02512915 authored by Laibin Qiu, committed by Yang Yingliang

blk-wbt: fix IO hang due to negative inflight counter

hulk inclusion
category: bugfix
bugzilla: 182135, https://gitee.com/openeuler/kernel/issues/I4ENC8


CVE: NA

--------------------------

Block testing reported the following stack. Some requests have been waiting
for wakeup in wbt_wait, and the vmcore shows that the wbt inflight counter
is -1, so the waiting requests can never be woken up.

PID: 75416  TASK: ffff88836c098000  CPU: 2   COMMAND: "fsstress"
[ffff8882e59a7608] __schedule at ffffffffb2d22a25
[ffff8882e59a7720] schedule at ffffffffb2d2358f
[ffff8882e59a7738] io_schedule at ffffffffb2d23bdc
[ffff8882e59a7750] rq_qos_wait at ffffffffb2400fde
[ffff8882e59a7878] wbt_wait at ffffffffb243a051
[ffff8882e59a7910] __rq_qos_throttle at ffffffffb2400a20
[ffff8882e59a7930] blk_mq_make_request at ffffffffb23de038
[ffff8882e59a7a98] generic_make_request at ffffffffb23c393d
[ffff8882e59a7b80] submit_bio at ffffffffb23c3db8
[ffff8882e59a7c48] submit_bio_wait at ffffffffb23b3a5d
[ffff8882e59a7cf0] blkdev_issue_flush at ffffffffb23c8f4c
[ffff8882e59a7d20] ext4_sync_fs at ffffffffc06dd708 [ext4]
[ffff8882e59a7dd0] sync_filesystem at ffffffffb21e8335
[ffff8882e59a7df8] ovl_sync_fs at ffffffffc0fd853a [overlay]
[ffff8882e59a7e10] sync_fs_one_sb at ffffffffb21e8221
[ffff8882e59a7e30] iterate_supers at ffffffffb218401e
[ffff8882e59a7e70] ksys_sync at ffffffffb21e8588
[ffff8882e59a7f20] __x64_sys_sync at ffffffffb21e861f
[ffff8882e59a7f28] do_syscall_64 at ffffffffb1c06bc8
[ffff8882e59a7f50] entry_SYSCALL_64_after_hwframe at ffffffffb2e000ad
RIP: 00007f479ab13347  RSP: 00007ffd4dda9fe8  RFLAGS: 00000202
RAX: ffffffffffffffda  RBX: 0000000000000068  RCX: 00007f479ab13347
RDX: 0000000000000000  RSI: 000000003e1b142d  RDI: 0000000000000068
RBP: 0000000051eb851f   R8: 00007f479abd4034   R9: 00007f479abd40a0
R10: 0000000000000000  R11: 0000000000000202  R12: 0000000000402c20
R13: 0000000000000001  R14: 0000000000000000  R15: 7fffffffffffffff

The ->inflight counter can become negative (-1) if:

1) blk-wbt was disabled when the IO was issued,
so the inflight count was not incremented.

2) blk-wbt was enabled before this IO was tracked.

3) the ->inflight counter is then decremented from
0 to -1 in endio().
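
The three steps above can be replayed in a small user-space model. This is
only a sketch: every toy_* name is hypothetical and the structures do not
mirror the kernel's; it assumes that the issue path counts a request only
while wbt is enabled and that the completion path decrements for every
request marked as tracked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; all names are hypothetical, not the kernel's. */
struct toy_wbt {
    bool enabled;   /* stands in for "wbt is enabled on this queue" */
    int  inflight;  /* stands in for the ->inflight counter */
};

struct toy_rq {
    bool tracked;   /* stands in for a "tracked" flag on the request */
};

/* Issue path: the counter is only incremented while wbt is enabled. */
static void toy_issue(struct toy_wbt *w)
{
    if (w->enabled)
        w->inflight++;
}

/* Tracking path: the request is marked tracked if wbt is enabled now. */
static void toy_track(const struct toy_wbt *w, struct toy_rq *rq)
{
    rq->tracked = w->enabled;
}

/* Completion path: every tracked request decrements the counter. */
static void toy_endio(struct toy_wbt *w, const struct toy_rq *rq)
{
    if (rq->tracked)
        w->inflight--;
}

/* Replays steps 1) to 3) and returns the final counter value. */
static int toy_replay_race(void)
{
    struct toy_wbt w = { .enabled = false, .inflight = 0 };
    struct toy_rq rq = { .tracked = false };

    toy_issue(&w);            /* 1) wbt disabled: nothing is counted */
    assert(w.inflight == 0);
    w.enabled = true;         /* 2) wbt enabled before the IO is tracked */
    toy_track(&w, &rq);
    toy_endio(&w, &rq);       /* 3) counter goes from 0 to -1 */
    return w.inflight;
}
```

In this model toy_replay_race() returns -1, the same value seen in the
vmcore.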

Fix the problem by freezing the queue while enabling wbt, so that
no request is in flight when wbt is switched on.
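
The effect of the freeze can be illustrated with a single-threaded toy
model. It is a sketch under stated assumptions, not the patch itself: the
toy_* names are hypothetical, and a real queue freeze blocks until the
queue drains, whereas the hypothetical toy_try_enable_wbt() below simply
returns false when the caller would have to wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; all names are hypothetical, not the kernel's. */
struct toy_queue {
    bool wbt_enabled;
    int  wbt_inflight;  /* stands in for the ->inflight counter */
    int  in_queue;      /* requests issued but not yet completed */
};

struct toy_req {
    bool tracked;
};

static void toy_issue(struct toy_queue *q)
{
    q->in_queue++;
    if (q->wbt_enabled)
        q->wbt_inflight++;
}

static void toy_track(const struct toy_queue *q, struct toy_req *rq)
{
    rq->tracked = q->wbt_enabled;
}

static void toy_endio(struct toy_queue *q, const struct toy_req *rq)
{
    if (rq->tracked)
        q->wbt_inflight--;
    q->in_queue--;
}

/* Models "freeze, then enable": only succeeds once the queue is drained. */
static bool toy_try_enable_wbt(struct toy_queue *q)
{
    if (q->in_queue > 0)
        return false;   /* a real freeze would block here instead */
    q->wbt_enabled = true;
    return true;
}

/* Replays the racy sequence with the freeze; returns the final counter. */
static int toy_replay_with_freeze(void)
{
    struct toy_queue q = { .wbt_enabled = false };
    struct toy_req a = { .tracked = false }, b = { .tracked = false };

    toy_issue(&q);                      /* issued while wbt is disabled */
    assert(!toy_try_enable_wbt(&q));    /* enabling must wait for it */
    toy_track(&q, &a);
    toy_endio(&q, &a);                  /* untracked: no decrement */
    assert(toy_try_enable_wbt(&q));     /* queue drained, wbt enabled */

    toy_issue(&q);                      /* counted: 0 -> 1 */
    toy_track(&q, &b);
    toy_endio(&q, &b);                  /* tracked: 1 -> 0 */
    return q.wbt_inflight;
}
```

Because wbt can only be switched on while no request sits between issue
and completion, every request is either both counted and tracked or
neither, and the counter ends at 0 in this model.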

Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
parent 01f486af