Commit 805b016e authored by Chiqijun, committed by Yang Yingliang

net/hinic: Fix reboot -f stuck for a long time



driver inclusion
category: bugfix
bugzilla: 4472

-----------------------------------------------------------------------

After the user runs reboot -f, the kernel goes through its restart
sequence, one step of which is device shutdown. The main reboot process
gets stuck in the hns3 NIC shutdown callback, waiting for the kernel's
rtnl lock. The lock is held because sar, ip, ethtool and similar
commands are querying network card information through the sysfs
interface, and the rtnl lock must be taken before the device is
accessed. When a hinic card is accessed, the caller first takes the rtnl
lock and then calls into the hinic callback hinic_get_link_ksettings to
fetch the data. The driver sends a request to the chip through the
assembly command and waits for the chip to respond. During device
shutdown the chip has already been shut down and cannot respond, so the
driver can only wait synchronously for the 10s timeout, and the rtnl
lock is held for those 10 seconds before it is released.

The hinic driver now sets the card to the absent state in its shutdown
callback, so that no further commands are sent to the chip and none of
them can block for a long time.
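
The effect of the absent flag can be illustrated with a minimal
user-space sketch. It is not the driver's code: struct demo_hwdev,
demo_send_cmd(), demo_set_api_stop(), demo_query_wait_ms() and the
-EPERM return value are hypothetical stand-ins, and the real wait is
replaced by a counter. Only HINIC_CHIP_PRESENT, HINIC_CHIP_ABSENT,
chip_present_flag and the 10s timeout come from this commit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define HINIC_CHIP_PRESENT 1
#define HINIC_CHIP_ABSENT  0

/* Assumed timeout, taken from the 10s figure in the commit message. */
#define DEMO_CMD_TIMEOUT_MS 10000u

/* Hypothetical, simplified stand-in for struct hinic_hwdev. */
struct demo_hwdev {
	int chip_present_flag; /* HINIC_CHIP_PRESENT or HINIC_CHIP_ABSENT */
	int chip_responds;     /* 0 once the chip has been shut down */
};

/*
 * Hypothetical command path. The time the caller would spend blocked
 * (and therefore holding the rtnl lock) is reported through waited_ms
 * instead of actually sleeping.
 */
static int demo_send_cmd(const struct demo_hwdev *dev, unsigned int *waited_ms)
{
	*waited_ms = 0;

	/* Fast path: card marked absent, do not touch the chip at all. */
	if (dev->chip_present_flag == HINIC_CHIP_ABSENT)
		return -EPERM; /* assumed error code */

	/* Chip is gone but not marked absent: wait out the full timeout. */
	if (!dev->chip_responds) {
		*waited_ms = DEMO_CMD_TIMEOUT_MS;
		return -ETIMEDOUT;
	}

	return 0;
}

/* Mirrors the shape of hinic_set_api_stop(): only flips the flag here. */
static void demo_set_api_stop(struct demo_hwdev *dev)
{
	if (!dev)
		return;

	dev->chip_present_flag = HINIC_CHIP_ABSENT;
}

/* How long one query would block while the caller holds the lock. */
static unsigned int demo_query_wait_ms(const struct demo_hwdev *dev)
{
	unsigned int waited_ms = 0;
	int err = demo_send_cmd(dev, &waited_ms);

	/* A successful command never reports a timeout wait. */
	assert(err != 0 || waited_ms == 0);
	return waited_ms;
}
```

In this sketch a query against a shut-down chip blocks for the full
timeout while the flag is still HINIC_CHIP_PRESENT, and returns
immediately once demo_set_api_stop() has been called.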

Signed-off-by: Chiqijun <chiqijun@huawei.com>
Reviewed-by: Zengweiliang <zengweiliang.zengweiliang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
parent e38ac7c9
+13 −0
@@ -2409,6 +2409,19 @@ void hinic_free_hwdev(void *hwdev)
	kfree(dev);
}

void hinic_set_api_stop(void *hwdev)
{
	struct hinic_hwdev *dev = hwdev;

	if (!hwdev)
		return;

	dev->chip_present_flag = HINIC_CHIP_ABSENT;
	sdk_info(dev->dev_hdl, "Set card absent\n");
	hinic_force_complete_all(dev);
	sdk_info(dev->dev_hdl, "All messages interacting with the chip will stop\n");
}

void hinic_shutdown_hwdev(void *hwdev)
{
	struct hinic_hwdev *dev = hwdev;
+1 −0
@@ -428,6 +428,7 @@ int hinic_init_hwdev(struct hinic_init_para *para);
int hinic_set_vf_dev_cap(void *hwdev);
void hinic_free_hwdev(void *hwdev);
void hinic_shutdown_hwdev(void *hwdev);
void hinic_set_api_stop(void *hwdev);

void hinic_ppf_hwdev_unreg(void *hwdev);
void hinic_ppf_hwdev_reg(void *hwdev, void *ppf_hwdev);
+0 −3
@@ -477,9 +477,6 @@ struct hinic_reg_info {

#define PCIE_MSIX_ATTR_ENTRY			0

#define HINIC_CHIP_PRESENT 1
#define HINIC_CHIP_ABSENT 0

struct hinic_cmd_fault_event {
	u8	status;
	u8	version;
+3 −0
@@ -48,6 +48,9 @@
#define HINIC_MGMT_STATUS_ERR_LEN         32  /* Length too short or too long */
#define HINIC_MGMT_STATUS_ERR_UNSUPPORT   0xFF /* Feature not supported */

#define HINIC_CHIP_PRESENT 1
#define HINIC_CHIP_ABSENT 0

struct cfg_mgmt_info;
struct rdma_comp_resource;

+3 −0
@@ -2729,6 +2729,9 @@ static void hinic_shutdown(struct pci_dev *pdev)
		hinic_shutdown_hwdev(pci_adapter->hwdev);

	pci_disable_device(pdev);

	if (pci_adapter)
		hinic_set_api_stop(pci_adapter->hwdev);
}

#ifdef HAVE_RHEL6_SRIOV_CONFIGURE