MLK-24912-2 crypto: caam - fix RNG vs. hwrng kthread race
commit a067df836f474adbedad8cc910987aef4d4b1497 from
https://source.codeaurora.org/external/imx/linux-imx

The following stack trace is met when stress-testing suspend/resume:

[...]
PM: suspend devices took 1.972 seconds
[...]
SError Interrupt on CPU1, code 0xbf000002 -- SError
CPU: 1 PID: 213 Comm: hwrng Not tainted 5.4.70-2.3.0+g72209dedd129 #1
Hardware name: Freescale i.MX8DXL EVK (DT)
pstate: 60000005 (nZCv daif -PAN -UAO)
pc : _raw_spin_unlock_bh+0x0/0x28
lr : caam_jr_enqueue+0x24c/0x378
sp : ffff8000127dbd10
x29: ffff8000127dbd10 x28: ffff00003cac5940
x27: 00000000bcb5ef80 x26: 0000000000000010
x25: ffff800011c12000 x24: ffff8000127dbdb8
x23: ffff800010ca2298 x22: ffff00003c8aec10
x21: ffff00003cb5ef80 x20: 00000000ffffff8d
x19: 0000000000000010 x18: 000000000000000e
x17: 0000000000000001 x16: 0000000000000019
x15: 0000000000000033 x14: 000000000000004c
x13: 0000000000000068 x12: ffff800011188e90
x11: ffff00003c897210 x10: 0000000000000026
x9 : 00000000a4dcb313 x8 : 0000000000000000
x7 : 0000000000000001 x6 : ffff800011b59000
x5 : 0000000000000000 x4 : 0000000000000001
x3 : 0000000000000004 x2 : 0000000000000014
x1 : 00000000000001ec x0 : ffff00003cac5940
Kernel panic - not syncing: Asynchronous SError Interrupt
CPU: 1 PID: 213 Comm: hwrng Not tainted 5.4.70-2.3.0+g72209dedd129 #1
Hardware name: Freescale i.MX8DXL EVK (DT)
Call trace:
 dump_backtrace+0x0/0x140
 show_stack+0x14/0x20
 dump_stack+0xb4/0x114
 panic+0x158/0x324
 nmi_panic+0x84/0x88
 arm64_serror_panic+0x74/0x80
 do_serror+0x80/0x138
 el1_error+0x84/0xf8
 _raw_spin_unlock_bh+0x0/0x28
 caam_rng_read_one.isra.0+0x1c8/0x3a0
 caam_read+0x80/0xa8
 hwrng_fillfn+0x8c/0x140
 kthread+0x138/0x158
 ret_from_fork+0x10/0x1c
SMP: stopping secondary CPUs
Kernel Offset: disabled
CPU features: 0x0002,20002008
Memory Limit: none

This happens when:
-the generic "hwrng" kthread tries to draw entropy and
-the current rng is caam's rng and
-the job ring used for caam rng hasn't been resumed yet (after a suspend)

The issue has been noticed also in upstream (for TPM device in ChromeOS)
and the fix proposed involved making the "hwrng" kthread freezable:
03a3bb7a ("hwrng: core - Freeze khwrng thread during suspend")
ff296293 ("random: Support freezable kthreads in add_hwgenerator_randomness()")
59b56948 ("random: Use wait_event_freezable() in add_hwgenerator_randomness()")

However, because these commits introduced a regression in virtio-rng
(Link: https://lore.kernel.org/lkml/4a45b3e0-ed3a-61d3-bfc6-957c7ba631bb@maciej.szmigiero.name )
they were later reverted in commit
08e97aec ("Revert "hwrng: core - Freeze khwrng thread during suspend"")

Since there was no progress in upstream and fixing virtio-rng regression
is not trivial, the solution chosen is to unregister / re-register
caam rng driver from hwrng during suspend / resume.

Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>