!2459 Bugfixes for RDMA/hns
Merge Pull Request from: @stinft
Upload seven bugfixes to fix four issues.
1. Problems related to DCA:
(1) HNS_ROCE_UCTX_RSP_DCA_FLAGS is set only if HNS_ROCE_UCTX_CONFIG_DCA is configured;
(2) The hr_qp can be a NULL pointer. A check has been added to avoid illegal access.
(3) DCA debugfs is not needed when DCA is not set for this ucontext.
(4) When the device is unregistered or a ucontext is destroyed while DCA debugfs is being accessed
concurrently, a NULL pointer may be dereferenced. This patch fixes it by delaying the assignment of
the pointer to NULL until debugfs has been unregistered.
https://gitee.com/openeuler/kernel/issues/I87LCF
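The ordering described in item (4) can be sketched in plain user-space C. This is only an illustration under stated assumptions: `struct uctx`, `struct dca_ctx`, `debugfs_registered`, `dca_debugfs_show()` and `uctx_teardown()` are hypothetical stand-ins, not the driver's real structures or functions, and the sketch is single-threaded, so it shows the order of the teardown steps rather than the race itself.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the driver's real structures. */
struct dca_ctx {
	int free_pages;
};

struct uctx {
	struct dca_ctx *dca;     /* what the debugfs reader dereferences */
	int debugfs_registered;  /* models whether the debugfs file still exists */
};

/* Models a debugfs read: only reachable while the file is registered. */
static int dca_debugfs_show(const struct uctx *u, int *out)
{
	if (!u->debugfs_registered)
		return -1;       /* file is gone, the pointer is never touched */
	*out = u->dca->free_pages;
	return 0;
}

/* Teardown in the fixed order: unregister first, then clear the pointer. */
static void uctx_teardown(struct uctx *u)
{
	u->debugfs_registered = 0; /* step 1: no new reader can enter */
	free(u->dca);              /* step 2: now safe to release ... */
	u->dca = NULL;             /* ... and to set the pointer to NULL */
}
```

If the two steps in `uctx_teardown()` were swapped, a reader entering between them would dereference a NULL pointer, which is the failure mode the item describes.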
2. Fix printing level of asynchronous events:
The current driver prints all asynchronous events, and some of the print levels are set improperly.
For example, SRQ limit reach and SRQ last wqe reach may also occur during normal operation of the
software, yet these events are currently printed as warnings, which causes a large
amount of printing even during normal use of the application. As a result, service performance
deteriorates. This patch fixes the printing storms by modifying the print level.
https://gitee.com/openeuler/kernel/issues/I87LIY
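A minimal sketch of the idea, assuming a per-event mapping to a print level. The enum names and the specific levels chosen here are hypothetical and only illustrate the approach; the description above does not state which level the patch actually assigns to each event.

```c
/* Hypothetical event and level names, for illustration only. */
enum aeq_event {
	EV_SRQ_LIMIT_REACH,
	EV_SRQ_LAST_WQE_REACH,
	EV_OTHER_ERROR
};

enum print_level {
	LVL_DEBUG,
	LVL_WARN
};

/* Events that are expected during normal operation get a quieter level. */
static enum print_level event_print_level(enum aeq_event ev)
{
	switch (ev) {
	case EV_SRQ_LIMIT_REACH:
	case EV_SRQ_LAST_WQE_REACH:
		return LVL_DEBUG; /* assumption: routine events are demoted */
	default:
		return LVL_WARN;  /* everything else keeps its warning level */
	}
}
```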
3. Fix signed-unsigned mix with relational:
ib_mtu_enum_to_int() and uverbs_attr_get_len() may return a negative value. In this case,
comparisons that mix signed and unsigned types produce wrong results.
https://gitee.com/openeuler/kernel/issues/I87LLN
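The pitfall can be reproduced in a few lines of standalone C. The helper names below (`len_ok_buggy`, `len_ok_fixed`) are hypothetical and not taken from the driver; the sketch only assumes a length that may carry a negative error code and an unsigned minimum to compare against.

```c
#include <stddef.h>

/*
 * Buggy pattern: in "len >= min_len" the int is implicitly converted to
 * size_t. The explicit cast below spells out what the compiler does, so a
 * negative error code turns into a very large unsigned value.
 */
static int len_ok_buggy(int len, size_t min_len)
{
	return (size_t)len >= min_len;
}

/* Fixed pattern: reject negative values before the unsigned comparison. */
static int len_ok_fixed(int len, size_t min_len)
{
	return len >= 0 && (size_t)len >= min_len;
}
```

With `len = -22` and `min_len = 16`, the buggy check accepts the value while the fixed check rejects it; for a valid length such as 64 both checks agree.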
4. Fix the concurrency error between bond and reset:
When setting bond and reset run concurrently, the driver detects after the reset has finished that the
bond resource has already been allocated, and therefore enters the bond recover
process, where the bond state is set to HNS_ROCE_BOND_IS_BONDED. But at this point
the set bond process hasn't been executed yet (i.e. slaves haven't been uninited). This wrong bond
state leads to the abnormal reset result that both slaves are registered as bond devices.
Thus delete the bond state setting in the bond recover process. Besides, to fix other potential
concurrency errors between bond and reset, some improvements are also added:
(1) If a reset occurs before bond work, add a reset check at the beginning of
bond work. If there is an ongoing reset process, re-queue the bond work until the reset is
finished.
(2) If a reset occurs during bond work, add reset checks to the bond init/uninit
process, treating this situation as an abnormal case.
https://gitee.com/openeuler/kernel/issues/I87LSW
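The two reset checks in (1) and (2) can be sketched as follows. This is a simplified user-space model under stated assumptions: `struct bond_dev`, `bond_init_step()`, `bond_work()` and the result enum are hypothetical names, and re-queueing is represented by a return value rather than by a real workqueue.

```c
/* Hypothetical model of the bond work flow; not the driver's real code. */
enum bond_work_result {
	BOND_WORK_REQUEUE, /* reset was ongoing before the work started */
	BOND_WORK_ABORT,   /* reset began while the work was running */
	BOND_WORK_DONE
};

struct bond_dev {
	int reset_in_progress;
	int slaves_done;
};

/* Check (2): each init/uninit step treats an ongoing reset as abnormal. */
static int bond_init_step(struct bond_dev *d)
{
	if (d->reset_in_progress)
		return -1;
	d->slaves_done++;
	return 0;
}

static enum bond_work_result bond_work(struct bond_dev *d, int nslaves)
{
	int i;

	/* Check (1): do not start while a reset is ongoing, try again later. */
	if (d->reset_in_progress)
		return BOND_WORK_REQUEUE;

	for (i = 0; i < nslaves; i++)
		if (bond_init_step(d))
			return BOND_WORK_ABORT;

	return BOND_WORK_DONE;
}
```

A caller would keep re-submitting the work while `BOND_WORK_REQUEUE` is returned, and would take its error path on `BOND_WORK_ABORT`.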
Chengchang Tang (6):
RDMA/hns: Fix context dca configuration
RDMA/hns: Fix potential NULL pointer in DCA memory query
RDMA/hns: Fix registering dca debugfs when dca has not been set
RDMA/hns: Fix printing level of asynchronous events
RDMA/hns: Fix signed-unsigned mix with relational
RDMA/hns: Fix unregistering device and accessing to debugfs concurrently
Junxian Huang (1):
RDMA/hns: Fix the concurrency error between bond and reset.
Link:https://gitee.com/openeuler/kernel/pulls/2459
Reviewed-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>