Commit 40770d83 authored by Cheng Yu

sched/topology: Fix cpus hotplug deadlock in check_node_limit()

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I8ZBLR


CVE: NA

-----------------

In some cases, taking CPUs offline one after another triggers a
hung task report. The stack of the blocked task is as follows:
  Call trace:
    __switch_to+0xc4/0x160
    __schedule+0x8a4/0xcc0
    schedule+0x58/0xf0
    percpu_rwsem_wait+0xac/0x1c0
    __percpu_down_read+0x44/0x100
    cpus_read_lock+0x5c/0x70
    static_key_disable+0x20/0x48
    sched_init_numa+0x588/0x5e8
    sched_update_numa+0x88/0xa0
    sched_cpu_deactivate+0x138/0x2b8
    cpuhp_invoke_callback+0x13c/0x5a0
    cpuhp_thread_fun+0x108/0x1b0
    smpboot_thread_fn+0x14c/0x198
    kthread+0xf0/0x108
    ret_from_fork+0x10/0x20

check_node_limit() is called with the CPU hotplug write lock already
held, but static_branch_disable() tries to take the CPU hotplug read
lock, which causes a deadlock.
To fix it, use static_branch_disable_cpuslocked() instead of
static_branch_disable() in check_node_limit().
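
The locking problem described above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: every toy_*
name is hypothetical, the hotplug lock is reduced to a flag and a
reader count, and a blocked read lock is reported as a false return
value instead of actually hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's CPU hotplug lock. */
struct toy_hotplug_lock {
	bool write_held;	/* a hotplug operation is in progress */
	int readers;		/* number of read-side holders */
};

/* Hypothetical stand-in for a static key such as sched_steal_allow. */
struct toy_static_key {
	bool enabled;
};

/*
 * Models cpus_read_lock(). Returns false where the real call would
 * block for as long as the write side is held.
 */
static bool toy_cpus_read_trylock(struct toy_hotplug_lock *l)
{
	if (l->write_held)
		return false;
	l->readers++;
	return true;
}

static void toy_cpus_read_unlock(struct toy_hotplug_lock *l)
{
	l->readers--;
}

/* Models static_branch_disable(): takes the read lock on its own. */
static bool toy_branch_disable(struct toy_hotplug_lock *l,
			       struct toy_static_key *k)
{
	if (!toy_cpus_read_trylock(l))
		return false;	/* the real kernel would hang here */
	k->enabled = false;
	toy_cpus_read_unlock(l);
	return true;
}

/*
 * Models static_branch_disable_cpuslocked(): the caller must already
 * hold the hotplug lock, so no lock is taken here. The assert plays
 * the role of a "lock is held" sanity check.
 */
static bool toy_branch_disable_cpuslocked(struct toy_hotplug_lock *l,
					  struct toy_static_key *k)
{
	assert(l->write_held || l->readers > 0);
	k->enabled = false;
	return true;
}
```

With write_held set, as it is during a hotplug callback,
toy_branch_disable() fails and leaves the key enabled, while
toy_branch_disable_cpuslocked() disables it without touching the lock.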

Fixes: c0830508 ("sched/fair: disable stealing if too many NUMA nodes")
Signed-off-by: Cheng Yu <serein.chengyu@huawei.com>
parent f6f9abf1
+1 −1
@@ -1880,7 +1880,7 @@ static void check_node_limit(void)
	if (sched_steal_node_limit == 0)
		sched_steal_node_limit = SCHED_STEAL_NODE_LIMIT_DEFAULT;
	if (n > sched_steal_node_limit) {
-		static_branch_disable(&sched_steal_allow);
+		static_branch_disable_cpuslocked(&sched_steal_allow);
		pr_debug("Suppressing sched STEAL. To enable, reboot with sched_steal_node_limit=%d", n);
	}
}