Commit a158905c authored by Ran Xiaokai, committed by liwei

cpu/hotplug: Don't offline the last non-isolated CPU

mainline inclusion
from mainline-v6.7-rc1
commit 38685e2a0476127db766f81b1c06019ddc4c9ffa
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9RFL2
CVE: CVE-2023-52831

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=38685e2a0476127db766f81b1c06019ddc4c9ffa



--------------------------------

If a system has isolated CPUs via the "isolcpus=" command line parameter,
then an attempt to offline the last housekeeping CPU will result in a
WARN_ON() when rebuilding the scheduler domains and a subsequent panic due
to an unhandled empty CPU mask in partition_sched_domains_locked().

cpuset_hotplug_workfn()
  rebuild_sched_domains_locked()
    ndoms = generate_sched_domains(&doms, &attr);
      cpumask_and(doms[0], top_cpuset.effective_cpus, housekeeping_cpumask(HK_FLAG_DOMAIN));

This results in an empty CPU mask, which triggers the warning and then the
subsequent crash:

WARNING: CPU: 4 PID: 80 at kernel/sched/topology.c:2366 build_sched_domains+0x120c/0x1408
Call trace:
 build_sched_domains+0x120c/0x1408
 partition_sched_domains_locked+0x234/0x880
 rebuild_sched_domains_locked+0x37c/0x798
 rebuild_sched_domains+0x30/0x58
 cpuset_hotplug_workfn+0x2a8/0x930

Unable to handle kernel paging request at virtual address fffe80027ab37080
 partition_sched_domains_locked+0x318/0x880
 rebuild_sched_domains_locked+0x37c/0x798
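
To make the empty-mask condition concrete, here is a minimal userspace sketch
that models CPU masks as plain 64-bit bitmaps. It is not kernel code: the
names mask_t, clear_cpu() and domain_span() are hypothetical, and the 4-CPU
"isolcpus=1-3" layout in the usage note is an assumed example. The kernel
itself uses struct cpumask and cpumask_and().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/* Remove one CPU from a mask, as happens to the online mask on unplug. */
static mask_t clear_cpu(mask_t mask, unsigned int cpu)
{
	assert(cpu < 64);
	return mask & ~(UINT64_C(1) << cpu);
}

/*
 * Models the cpumask_and() step from the call chain above:
 * the domain span is the effective CPUs restricted to housekeeping CPUs.
 */
static mask_t domain_span(mask_t effective, mask_t housekeeping)
{
	return effective & housekeeping;
}
```

With four online CPUs (0xf) and only CPU 0 as housekeeping (0x1),
domain_span(0xf, 0x1) is 0x1. After clear_cpu(0xf, 0) the span becomes 0,
which corresponds to the empty doms[0] that the scheduler code cannot handle.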

Aside from the resulting crash, it does not make any sense to offline the
last housekeeping CPU.

Prevent this by masking out the non-housekeeping CPUs when selecting a
target CPU for initiating the CPU unplug operation via the work queue.
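
The selection rule described above can be sketched in userspace with plain
bitmaps. This is an illustrative model only, not the patched kernel function:
pick_down_target() and NR_CPUS_MODEL are hypothetical names, and the real code
iterates with for_each_cpu_and() and dispatches via work_on_cpu().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_CPUS_MODEL 64	/* hypothetical limit for this model only */

/*
 * Pick a CPU to run the unplug work on. The candidate must be online,
 * must be a housekeeping CPU, and must not be the CPU being taken down.
 * Returns the CPU number, or -EBUSY when no such CPU exists.
 */
static int pick_down_target(uint64_t online, uint64_t housekeeping,
			    unsigned int victim)
{
	uint64_t candidates = online & housekeeping;
	unsigned int cpu;

	assert(victim < NR_CPUS_MODEL);

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (!(candidates & (UINT64_C(1) << cpu)))
			continue;	/* offline or isolated CPU */
		if (cpu != victim)
			return (int)cpu;
	}
	return -EBUSY;
}
```

Assuming four online CPUs with only CPU 0 as housekeeping, a request to
offline CPU 0 finds no other candidate and is refused with -EBUSY, while a
request to offline isolated CPU 2 still succeeds and runs its work on CPU 0.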

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/202310171709530660462@zte.com.cn


Conflicts:
	kernel/cpu.c
[commit 04d4e665 ("sched/isolation: Use single feature type while referring to housekeeping cpumask") was not merged]
Signed-off-by: liwei <liwei728@huawei.com>
parent f300accf
+7 −4
@@ -1142,12 +1142,15 @@ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
 	/*
 	 * Ensure that the control task does not run on the to be offlined
 	 * CPU to prevent a deadlock against cfs_b->period_timer.
+	 * Also keep at least one housekeeping cpu onlined to avoid generating
+	 * an empty sched_domain span.
 	 */
-	cpu = cpumask_any_but(cpu_online_mask, cpu);
-	if (cpu >= nr_cpu_ids)
-		return -EBUSY;
-	return work_on_cpu(cpu, __cpu_down_maps_locked, &work);
+	for_each_cpu_and(cpu, cpu_online_mask, housekeeping_cpumask(HK_FLAG_DOMAIN)) {
+		if (cpu != work.cpu)
+			return work_on_cpu(cpu, __cpu_down_maps_locked, &work);
+	}
+	return -EBUSY;
 }

 static int cpu_down(unsigned int cpu, enum cpuhp_state target)
 {