Commit 6046a54e authored by Yipeng Zou

sched: smart grid: check is active in affinity timer

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7ZBSR


CVE: NA

----------------------------------------

In a test case like the one below, there is a chance that the system
hangs in the function that stops the affinity timer.

for i in `seq 1 100000`
do
	echo $1 > /sys/fs/cgroup/cpu/cpu.dynamic_affinity_mode 2>&1
done

[  259.061097] CPU: 0 PID: 10223 Comm: sh Kdump: loaded Not tainted 5.10.0OLK-5.10-SP3_SG+ #102
[  259.061098] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDD, BIOS 1.08 12/14/2019
[  259.061098] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[  259.061099] pc : hrtimer_active+0x2c/0x7c
[  259.061099] lr : hrtimer_cancel+0x3c/0x60
[  259.061100] sp : ffff800167f33c10
[  259.061100] x29: ffff800167f33c10 x28: ffff003006bde040
[  259.061101] x27: 0000000000000000 x26: 0000000000000000
[  259.061102] x25: 0000000000000000 x24: ffff0020b2cdea20
[  259.061103] x23: ffff800167f33d48 x22: ffff800011cc6100
[  259.061105] x21: ffff80001170a120 x20: ffff800011728d88
[  259.061106] x19: ffff00208e53c418 x18: 0000000000000000
[  259.061107] x17: 0000000000000000 x16: 0000000000000000
[  259.061108] x15: 0000aaab02369e80 x14: 0000000000000000
[  259.061109] x13: 0000000000000000 x12: 0000000000000000
[  259.061110] x11: 0000000000000000 x10: 0000000000000008
[  259.061111] x9 : ffff80001017c6bc x8 : 00000000ffffffc9
[  259.061112] x7 : ffff0020cb082581 x6 : 0000000000000000
[  259.061114] x5 : ffff0020cb082582 x4 : 0000000000000000
[  259.061115] x3 : ffff202fffb9adc0 x2 : ffff202fffb9ae00
[  259.061116] x1 : 0000000000003b82 x0 : ffff00208e53c418
[  259.061117] Call trace:
[  259.061117]  hrtimer_active+0x2c/0x7c
[  259.061118]  stop_auto_affinity+0xf0/0x170
[  259.061118]  cpu_affinity_mode_write_u64+0x2c/0x60
[  259.061119]  cgroup_file_write+0x128/0x1b0
[  259.061119]  kernfs_fop_write_iter+0x130/0x1c0
[  259.061120]  new_sync_write+0xec/0x18c
[  259.061121]  vfs_write+0x214/0x2ac
[  259.061121]  ksys_write+0x70/0xfc
[  259.061122]  __arm64_sys_write+0x24/0x30
[  259.061122]  invoke_syscall+0x50/0x11c
[  259.061123]  el0_svc_common.constprop.0+0x158/0x164
[  259.061123]  do_el0_svc+0x2c/0xac
[  259.061123]  el0_svc+0x20/0x30
[  259.061124]  el0_sync_handler+0xb0/0xb4
[  259.061124]  el0_sync+0x160/0x180

The root cause is that hrtimer_cancel() is called while holding the
spin lock that the running timer callback is also trying to acquire.

Thread:				hrtimer:

acquire lock			acquire lock
cancel hrtimer sync		...		// deadlock
...
release lock			release lock

The thread waits forever for the hrtimer to exit, and the hrtimer waits
forever for the spin lock.
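
The ordering problem above can be illustrated with a userspace analogy.
This is only a sketch under stated assumptions, not the kernel patch: a
pthread stands in for the hrtimer, a mutex stands in for the spin lock,
and every name (affinity_demo, timer_fn, demo_start, demo_stop,
run_demo) is hypothetical. The callback checks an "active" flag after
taking the lock, and the stop path drops the lock before it waits for
the callback, so neither side can block the other forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for the affinity timer state. */
struct affinity_demo {
	pthread_mutex_t lock;	/* plays the role of the spin lock */
	bool active;		/* true while the "timer" may do work */
	int ticks;		/* callback iterations that did work */
	pthread_t timer;	/* thread standing in for the hrtimer */
};

/* "Timer callback": take the lock, then check whether we are still active. */
static void *timer_fn(void *arg)
{
	struct affinity_demo *d = arg;
	struct timespec period = { .tv_sec = 0, .tv_nsec = 1000000 };

	for (;;) {
		pthread_mutex_lock(&d->lock);
		if (!d->active) {
			/* Stopped: release the lock and exit at once. */
			pthread_mutex_unlock(&d->lock);
			return NULL;
		}
		d->ticks++;
		pthread_mutex_unlock(&d->lock);
		nanosleep(&period, NULL);
	}
}

static int demo_start(struct affinity_demo *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->active = true;
	d->ticks = 0;
	return pthread_create(&d->timer, NULL, timer_fn, d);
}

/* Stop path: clear the flag under the lock, drop the lock, then wait. */
static int demo_stop(struct affinity_demo *d)
{
	pthread_mutex_lock(&d->lock);
	d->active = false;
	pthread_mutex_unlock(&d->lock);

	/*
	 * Waiting with the lock released is safe: the callback can still
	 * acquire the lock, see !active and return. Waiting while holding
	 * the lock would reproduce the deadlock described above.
	 */
	return pthread_join(d->timer, NULL);
}

/* Start and stop repeatedly, like the echo loop; returns 0 on success. */
int run_demo(int rounds)
{
	struct affinity_demo d;

	for (int i = 0; i < rounds; i++) {
		if (demo_start(&d) != 0 || demo_stop(&d) != 0)
			return -1;
		assert(!d.active);
		pthread_mutex_destroy(&d.lock);
	}
	return 0;
}
```

Calling run_demo(100) from a test program built with -pthread runs 100
start/stop rounds. Moving the pthread_join() call above the unlock in
demo_stop() would turn the sketch into the hanging variant.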

Fixes: 16980bd872f6 ("sched: Introduce smart grid scheduling strategy for cfs")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
parent 6a008918