Documentation/fault-injection/notifier-error-inject.txt +0 −30

@@ -6,41 +6,11 @@ specified notifier chain callbacks. It is useful to test the error handling of
 notifier call chain failures which is rarely executed. There are kernel
 modules that can be used to test the following notifiers.
 
- * CPU notifier
  * PM notifier
  * Memory hotplug notifier
  * powerpc pSeries reconfig notifier
  * Netdevice notifier
 
-CPU notifier error injection module
------------------------------------
-This feature can be used to test the error handling of the CPU notifiers by
-injecting artificial errors to CPU notifier chain callbacks.
-
-If the notifier call chain should be failed with some events notified, write
-the error code to debugfs interface
-/sys/kernel/debug/notifier-error-inject/cpu/actions/<notifier event>/error
-
-Possible CPU notifier events to be failed are:
-
- * CPU_UP_PREPARE
- * CPU_UP_PREPARE_FROZEN
- * CPU_DOWN_PREPARE
- * CPU_DOWN_PREPARE_FROZEN
-
-Example1: Inject CPU offline error (-1 == -EPERM)
-
-	# cd /sys/kernel/debug/notifier-error-inject/cpu
-	# echo -1 > actions/CPU_DOWN_PREPARE/error
-	# echo 0 > /sys/devices/system/cpu/cpu1/online
-	bash: echo: write error: Operation not permitted
-
-Example2: inject CPU online error (-2 == -ENOENT)
-
-	# echo -2 > actions/CPU_UP_PREPARE/error
-	# echo 1 > /sys/devices/system/cpu/cpu1/online
-	bash: echo: write error: No such file or directory
-
 PM notifier error injection module
 ----------------------------------
 This feature is controlled through debugfs interface

Documentation/power/suspend-and-cpuhotplug.txt +4 −5

@@ -232,7 +232,7 @@ d. Handling microcode update during suspend/hibernate:
    hibernate/restore cycle.]
 
    In the current design of the kernel however, during a CPU offline operation
-   as part of the suspend/hibernate cycle (the CPU_DEAD_FROZEN notification),
+   as part of the suspend/hibernate cycle (cpuhp_tasks_frozen is set),
    the existing copy of microcode image in the kernel is not freed up.
    And during the CPU online operations (during resume/restore), since the
    kernel finds that it already has copies of the microcode images for all the
@@ -252,10 +252,9 @@ Yes, they are listed below:
    the _cpu_down() and _cpu_up() functions is *always* 0.
    This might not reflect the true current state of the system, since the
    tasks could have been frozen by an out-of-band event such as a suspend
-   operation in progress. Hence, it will lead to wrong notifications being
-   sent during the cpu online/offline events (eg, CPU_ONLINE notification
-   instead of CPU_ONLINE_FROZEN) which in turn will lead to execution of
-   inappropriate code by the callbacks registered for such CPU hotplug events.
+   operation in progress. Hence, the cpuhp_tasks_frozen variable will not
+   reflect the frozen state and the CPU hotplug callbacks which evaluate
+   that variable might execute the wrong code path.
 
 2. If a regular CPU hotplug stress test happens to race with the freezer due
    to a suspend operation in progress at the same time, then we could hit the

include/linux/cpu.h +10 −17

@@ -55,24 +55,17 @@ extern void unregister_cpu(struct cpu *cpu);
 extern ssize_t arch_cpu_probe(const char *, size_t);
 extern ssize_t arch_cpu_release(const char *, size_t);
 #endif
-struct notifier_block;
 
-#define CPU_ONLINE		0x0002 /* CPU (unsigned)v is up */
-#define CPU_UP_PREPARE		0x0003 /* CPU (unsigned)v coming up */
-#define CPU_DEAD		0x0007 /* CPU (unsigned)v dead */
-#define CPU_POST_DEAD		0x0009 /* CPU (unsigned)v dead, cpu_hotplug
-					* lock is dropped */
-#define CPU_BROKEN		0x000B /* CPU (unsigned)v did not die properly,
-					* perhaps due to preemption. */
-
-/* Used for CPU hotplug events occurring while tasks are frozen due to a suspend
- * operation in progress
- */
-#define CPU_TASKS_FROZEN	0x0010
-
-#define CPU_ONLINE_FROZEN	(CPU_ONLINE | CPU_TASKS_FROZEN)
-#define CPU_UP_PREPARE_FROZEN	(CPU_UP_PREPARE | CPU_TASKS_FROZEN)
-#define CPU_DEAD_FROZEN		(CPU_DEAD | CPU_TASKS_FROZEN)
+/*
+ * These states are not related to the core CPU hotplug mechanism. They are
+ * used by various (sub)architectures to track internal state
+ */
+#define CPU_ONLINE		0x0002 /* CPU is up */
+#define CPU_UP_PREPARE		0x0003 /* CPU coming up */
+#define CPU_DEAD		0x0007 /* CPU dead */
+#define CPU_DEAD_FROZEN		0x0008 /* CPU timed out on unplug */
+#define CPU_POST_DEAD		0x0009 /* CPU successfully unplugged */
+#define CPU_BROKEN		0x000B /* CPU did not die properly */
 
 #ifdef CONFIG_SMP
 extern bool cpuhp_tasks_frozen;