PCI/DOE: Fix destroy_work_on_stack() race
The following debug object splat was observed in testing: ODEBUG: free active (active state 0) object: 0000000097d23782 object type: work_struct hint: doe_statemachine_work+0x0/0x510 WARNING: CPU: 1 PID: 71 at lib/debugobjects.c:514 debug_print_object+0x7d/0xb0 ... Workqueue: pci 0000:36:00.0 DOE [1 doe_statemachine_work RIP: 0010:debug_print_object+0x7d/0xb0 ... Call Trace: ? debug_print_object+0x7d/0xb0 ? __pfx_doe_statemachine_work+0x10/0x10 debug_object_free.part.0+0x11b/0x150 doe_statemachine_work+0x45e/0x510 process_one_work+0x1d4/0x3c0 This occurs because destroy_work_on_stack() was called after signaling the completion in the calling thread. This creates a race between destroy_work_on_stack() and the task->work struct going out of scope in pci_doe(). Signal the work complete after destroying the work struct. This is safe because signal_task_complete() is the final thing the work item does and the workqueue code is careful not to access the work struct after. Fixes: abf04be0 ("PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y") Link: https://lore.kernel.org/r/20230726-doe-fix-v1-1-af07e614d4dd@intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Acked-by: Dan Williams <dan.j.williams@intel.com>
Please register or sign in to comment