sched: smart grid: check is active in affinity timer
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I7ZBSR
CVE: NA

----------------------------------------

In a test case like the one below, there is a chance of hanging the
system in the function that stops the affinity timer.

for i in `seq 1 100000`
do
	echo $1 > /sys/fs/cgroup/cpu/cpu.dynamic_affinity_mode 2>&1
done

[ 259.061097] CPU: 0 PID: 10223 Comm: sh Kdump: loaded Not tainted 5.10.0OLK-5.10-SP3_SG+ #102
[ 259.061098] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDD, BIOS 1.08 12/14/2019
[ 259.061098] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[ 259.061099] pc : hrtimer_active+0x2c/0x7c
[ 259.061099] lr : hrtimer_cancel+0x3c/0x60
[ 259.061100] sp : ffff800167f33c10
[ 259.061100] x29: ffff800167f33c10 x28: ffff003006bde040
[ 259.061101] x27: 0000000000000000 x26: 0000000000000000
[ 259.061102] x25: 0000000000000000 x24: ffff0020b2cdea20
[ 259.061103] x23: ffff800167f33d48 x22: ffff800011cc6100
[ 259.061105] x21: ffff80001170a120 x20: ffff800011728d88
[ 259.061106] x19: ffff00208e53c418 x18: 0000000000000000
[ 259.061107] x17: 0000000000000000 x16: 0000000000000000
[ 259.061108] x15: 0000aaab02369e80 x14: 0000000000000000
[ 259.061109] x13: 0000000000000000 x12: 0000000000000000
[ 259.061110] x11: 0000000000000000 x10: 0000000000000008
[ 259.061111] x9 : ffff80001017c6bc x8 : 00000000ffffffc9
[ 259.061112] x7 : ffff0020cb082581 x6 : 0000000000000000
[ 259.061114] x5 : ffff0020cb082582 x4 : 0000000000000000
[ 259.061115] x3 : ffff202fffb9adc0 x2 : ffff202fffb9ae00
[ 259.061116] x1 : 0000000000003b82 x0 : ffff00208e53c418
[ 259.061117] Call trace:
[ 259.061117] hrtimer_active+0x2c/0x7c
[ 259.061118] stop_auto_affinity+0xf0/0x170
[ 259.061118] cpu_affinity_mode_write_u64+0x2c/0x60
[ 259.061119] cgroup_file_write+0x128/0x1b0
[ 259.061119] kernfs_fop_write_iter+0x130/0x1c0
[ 259.061120] new_sync_write+0xec/0x18c
[ 259.061121] vfs_write+0x214/0x2ac
[ 259.061121] ksys_write+0x70/0xfc
[ 259.061122] __arm64_sys_write+0x24/0x30
[ 259.061122] invoke_syscall+0x50/0x11c
[ 259.061123] el0_svc_common.constprop.0+0x158/0x164
[ 259.061123] do_el0_svc+0x2c/0xac
[ 259.061123] el0_svc+0x20/0x30
[ 259.061124] el0_sync_handler+0xb0/0xb4
[ 259.061124] el0_sync+0x160/0x180

The root cause is that hrtimer_cancel() is called with the spin lock
held while the timer callback is running.

Thread:                        hrtimer:
acquire lock                   acquire lock
cancel hrtimer sync
...        // deadlock         ...
release lock                   release lock

The thread waits forever for the hrtimer to exit, and the hrtimer waits
forever for the spin lock.

Fixes: 16980bd872f6 ("sched: Introduce smart grid scheduling strategy for cfs")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
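
The lock ordering described above can be illustrated with a small userspace sketch. This is an analogy under stated assumptions, not the kernel patch: a pthread mutex stands in for the spin lock, a worker thread stands in for the hrtimer callback, and pthread_join() stands in for the synchronous hrtimer_cancel(). The names fake_affinity, timer_fn, stop_timer_safely and run_demo are hypothetical and do not appear in the kernel source.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the affinity timer state. */
struct fake_affinity {
	pthread_mutex_t lock;	/* plays the role of the spin lock */
	bool period_active;	/* should the "timer" keep running? */
	int ticks;		/* work done by the callback */
	pthread_t timer;	/* plays the role of the hrtimer */
};

/* "Timer callback": takes the lock, checks the active flag, then works. */
static void *timer_fn(void *arg)
{
	struct fake_affinity *fa = arg;

	for (;;) {
		pthread_mutex_lock(&fa->lock);
		if (!fa->period_active) {
			/* Asked to stop: leave without touching state. */
			pthread_mutex_unlock(&fa->lock);
			return NULL;
		}
		fa->ticks++;
		pthread_mutex_unlock(&fa->lock);
		sched_yield();
	}
}

/*
 * Safe stop: clear the flag under the lock, drop the lock, and only
 * then wait for the callback to finish. Joining while still holding
 * fa->lock would reproduce the hang: timer_fn would block in
 * pthread_mutex_lock() and never return.
 */
static int stop_timer_safely(struct fake_affinity *fa)
{
	pthread_mutex_lock(&fa->lock);
	fa->period_active = false;
	pthread_mutex_unlock(&fa->lock);

	return pthread_join(fa->timer, NULL);
}

/* Start the fake timer, stop it, and return 0 on a clean shutdown. */
int run_demo(void)
{
	struct fake_affinity fa = { .period_active = true, .ticks = 0 };
	int rc;

	pthread_mutex_init(&fa.lock, NULL);
	rc = pthread_create(&fa.timer, NULL, timer_fn, &fa);
	assert(rc == 0);

	rc = stop_timer_safely(&fa);
	pthread_mutex_destroy(&fa.lock);
	return rc;
}
```

Calling run_demo() repeatedly, much like the shell loop in the test case, should return 0 each time because the waiter never holds the lock that the callback needs in order to exit.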