Commit 232f776a authored by Chengchang Tang, committed by openeuler-sync-bot

RDMA/hns: Fix sleeping in atomic context during DCA unloading

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7A2VV



---------------------------------------------------------------

When running RDMA traffic on a KASAN-enabled kernel, the following
error is reported:

[ 9013.831512] BUG: sleeping function called from invalid context at kernel/workqueue.c:3111
[ 9013.839719] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 7938, name: ib_send_bw
[ 9013.847917] preempt_count: 1, expected: 0
[ 9013.851932] RCU nest depth: 0, expected: 0
[ 9013.856034] INFO: lockdep is turned off.
[ 9013.859963] CPU: 28 PID: 7938 Comm: ib_send_bw Kdump: loaded Tainted: G        W  O       6.3.0-rc4+ #1
[ 9013.869360] Hardware name: To be filled by O.E.M. HixxxxEVB 1P 2T2N V2.1.5/To be filled by O.E.M., BIOS HixxxxEVB 1P 2T2N V2.1.5 05/04/  23
[ 9013.881883] Call trace:
[ 9013.884322]  dump_backtrace+0xac/0xf0
[ 9013.887984]  show_stack+0x20/0x38
[ 9013.891294]  dump_stack_lvl+0xbc/0x120
[ 9013.895040]  dump_stack+0x1c/0x28
[ 9013.898350]  __might_resched+0x1c8/0x348
[ 9013.902271]  __might_sleep+0x78/0xf8
[ 9013.905841]  __flush_work+0x118/0x718
[ 9013.909501]  __cancel_work_timer+0x228/0x2d8
[ 9013.913769]  cancel_delayed_work_sync+0x1c/0x30
[ 9013.918296]  cleanup_dca_context+0x48/0x1e8 [hns_roce_hw_v2]
[ 9013.923998]  hns_roce_unregister_udca+0x3c/0x98 [hns_roce_hw_v2]
[ 9013.930043]  hns_roce_dealloc_ucontext+0x15c/0x180 [hns_roce_hw_v2]
[ 9013.936348]  uverbs_destroy_ufile_hw+0xd0/0x160
[ 9013.940877]  ib_uverbs_close+0x3c/0x170
[ 9013.944709]  __fput+0xfc/0x3c8
[ 9013.947761]  ____fput+0x18/0x30
[ 9013.950898]  task_work_run+0x130/0x1b8
[ 9013.954643]  do_exit+0x4f0/0xfa8
[ 9013.957867]  do_group_exit+0x60/0xf8
[ 9013.961437]  get_signal+0xf2c/0x1018
[ 9013.965009]  do_notify_resume+0x2d0/0x1560
[ 9013.969102]  el0_svc+0x98/0xa0
[ 9013.972152]  el0t_64_sync_handler+0xb8/0xc0
[ 9013.976332]  el0t_64_sync+0x1a4/0x1a8

This issue was introduced by the DCA aging feature. To stop the aging
work during DCA unload, cancel_delayed_work_sync() is called, but it may
sleep and so cannot be used in atomic context. The unload path, however,
takes a spinlock to guard against concurrency, which puts the call in
atomic context.

In fact, DCA aging is only triggered during IO, while DCA unloading
happens as part of ucontext destroy or driver uninstallation. By that
time, all the QPs have been released, so there is no concurrency in
this process and the spinlock is redundant.

This patch removes the unnecessary spinlock.
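
The effect can be illustrated with a minimal userspace sketch. It is not
the driver code: every name below is hypothetical, the spinlock is modeled
as a simple depth counter, and it assumes (as the trace and the fix
suggest) that the sleeping call was made while the lock was held.

```c
/* Hypothetical userspace model; none of these are hns driver symbols. */
static int preempt_depth;     /* > 0 models "in atomic context" */
static int sleep_violations;  /* sleeping calls made while atomic */

/* Taking a spinlock disables preemption, modeled as a counter. */
static void model_spin_lock(void)   { preempt_depth++; }
static void model_spin_unlock(void) { preempt_depth--; }

/* Models a call that may sleep, such as cancel_delayed_work_sync(). */
static void model_cancel_work_sync(void)
{
    if (preempt_depth > 0)
        sleep_violations++;   /* the kernel would print the BUG here */
}

/* Before the patch: the sleeping call runs under the spinlock. */
static int unload_before(void)
{
    sleep_violations = 0;
    model_spin_lock();
    model_cancel_work_sync();
    model_spin_unlock();
    return sleep_violations;
}

/* After the patch: no lock is taken, so the call may sleep safely. */
static int unload_after(void)
{
    sleep_violations = 0;
    model_cancel_work_sync();
    return sleep_violations;
}
```

In this model, unload_before() records one violation and unload_after()
records none, which mirrors why dropping the lock is enough once no
concurrent user of the DCA context remains.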

Fixes: d3caaebd ("RDMA/hns: Optimize user DCA perfermance by sharing DCA status")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
(cherry picked from commit 4fa42930)
parent cfe4854c