  1. Dec 03, 2021
    • workqueue: Fix unbind_workers() VS wq_worker_running() race · 07edfece
      Frederic Weisbecker authored
      At CPU-hotplug time, unbind_workers() may preempt a worker while it is
      waking up. In that case the following scenario can happen:
      
              unbind_workers()                     wq_worker_running()
              ----------------                     -------------------
                                                   if (!(worker->flags & WORKER_NOT_RUNNING))
                                                       //PREEMPTED by unbind_workers
              worker->flags |= WORKER_UNBOUND;
              [...]
              atomic_set(&pool->nr_running, 0);
              //resume to worker
                                                   atomic_inc(&worker->pool->nr_running);
      
      After unbind_workers() resets pool->nr_running, the value is expected to
      remain 0 until the pool gets rebound, which happens only if cpu_up() is
      later called on the target CPU. But here the race leaves pool->nr_running
      with a value of 1, triggering the following warning when the worker goes
      idle:
      
      	WARNING: CPU: 3 PID: 34 at kernel/workqueue.c:1823 worker_enter_idle+0x95/0xc0
      	Modules linked in:
      	CPU: 3 PID: 34 Comm: kworker/3:0 Not tainted 5.16.0-rc1+ #34
      	Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
      	Workqueue:  0x0 (rcu_par_gp)
      	RIP: 0010:worker_enter_idle+0x95/0xc0
      	Code: 04 85 f8 ff ff ff 39 c1 7f 09 48 8b 43 50 48 85 c0 74 1b 83 e2 04 75 99 8b 43 34 39 43 30 75 91 8b 83 00 03 00 00 85 c0 74 87 <0f> 0b 5b c3 48 8b 35 70 f1 37 01 48 8d 7b 48 48 81 c6 e0 93  0
      	RSP: 0000:ffff9b7680277ed0 EFLAGS: 00010086
      	RAX: 00000000ffffffff RBX: ffff93465eae9c00 RCX: 0000000000000000
      	RDX: 0000000000000000 RSI: ffff9346418a0000 RDI: ffff934641057140
      	RBP: ffff934641057170 R08: 0000000000000001 R09: ffff9346418a0080
      	R10: ffff9b768027fdf0 R11: 0000000000002400 R12: ffff93465eae9c20
      	R13: ffff93465eae9c20 R14: ffff93465eae9c70 R15: ffff934641057140
      	FS:  0000000000000000(0000) GS:ffff93465eac0000(0000) knlGS:0000000000000000
      	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      	CR2: 0000000000000000 CR3: 000000001cc0c000 CR4: 00000000000006e0
      	DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      	DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      	Call Trace:
      	  <TASK>
      	  worker_thread+0x89/0x3d0
      	  ? process_one_work+0x400/0x400
      	  kthread+0x162/0x190
      	  ? set_kthread_struct+0x40/0x40
      	  ret_from_fork+0x22/0x30
      	  </TASK>
      
      Also, due to this incorrect "nr_running == 1", further queued work may
      end up not being served, because no worker is woken up at work insertion
      time. This causes rcutorture writer stalls, for example.
      
      Fix this by disabling preemption around the flag check and the nr_running
      increment in wq_worker_running().
      
      It's worth noting that if the worker migrates and runs concurrently with
      unbind_workers(), it is guaranteed to see the WORKER_UNBOUND flag update
      due to set_cpus_allowed_ptr() acquiring/releasing rq->lock.
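
The interleaving described above can be replayed deterministically in user space. The following is only a minimal sketch, not the kernel code: the `*_model` types, the `WORKER_NOT_RUNNING_MODEL` flag and all function names are hypothetical stand-ins, and "preemption" is modeled simply by the order in which the steps are called. It assumes the worker was sleeping, so the counter starts at 0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's WORKER_NOT_RUNNING mask. */
#define WORKER_NOT_RUNNING_MODEL 0x1u

struct pool_model {
	int nr_running;
};

struct worker_model {
	unsigned int flags;
	struct pool_model *pool;
};

/* Models the relevant part of unbind_workers(): flag the worker, reset the count. */
static void model_unbind(struct worker_model *w)
{
	w->flags |= WORKER_NOT_RUNNING_MODEL;
	w->pool->nr_running = 0;
}

/* Racy order: flag check, preemption by the unbind step, then the increment. */
static int racy_interleaving(void)
{
	struct pool_model pool = { .nr_running = 0 };
	struct worker_model w = { .flags = 0, .pool = &pool };

	bool may_count = !(w.flags & WORKER_NOT_RUNNING_MODEL);
	model_unbind(&w);		/* "preempted" between check and increment */
	if (may_count)
		pool.nr_running++;	/* acts on a stale check */
	return pool.nr_running;
}

/* Check and increment are indivisible; the unbind step runs before them. */
static int fixed_unbind_before(void)
{
	struct pool_model pool = { .nr_running = 0 };
	struct worker_model w = { .flags = 0, .pool = &pool };

	model_unbind(&w);
	if (!(w.flags & WORKER_NOT_RUNNING_MODEL))
		pool.nr_running++;	/* skipped: the flag is already set */
	return pool.nr_running;
}

/* Check and increment are indivisible; the unbind step runs after them. */
static int fixed_unbind_after(void)
{
	struct pool_model pool = { .nr_running = 0 };
	struct worker_model w = { .flags = 0, .pool = &pool };

	if (!(w.flags & WORKER_NOT_RUNNING_MODEL))
		pool.nr_running++;
	model_unbind(&w);		/* the reset wins, the count ends at 0 */
	return pool.nr_running;
}
```

In this model only `racy_interleaving()` leaves the counter at 1; once the check and the increment cannot be separated, both remaining orderings end with a count of 0, which is the invariant the warning in worker_enter_idle() checks.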
      
      Fixes: 6d25be57 ("sched/core, workqueues: Distangle worker accounting from rq lock")
      Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
      Tested-by: Paul E. McKenney <paulmck@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  2. Dec 02, 2021
    • workqueue: Upgrade queue_work_on() comment · 443378f0
      Paul E. McKenney authored
      
      
      The current queue_work_on() docbook comment says that the caller must
      ensure that the specified CPU can't go away, but does not spell out the
      consequences, which turn out to be quite mild.  Therefore expand this
      comment to explicitly say that the penalty for failing to nail down the
      specified CPU is that the workqueue handler might find itself executing
      on some other CPU.
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  3. Dec 01, 2021
    • MAINTAINERS: co-maintain random.c · 58e1100f
      Jason A. Donenfeld authored
      
      
      random.c is a bit understaffed, and folks want more prompt reviews. I've
      got the crypto background and the interest to do these reviews, and have
      authored parts of the file already.
      
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f080815f
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM64:
      
         - Fix constant sign extension affecting TCR_EL2 and preventing
           running on ARMv8.7 models due to spurious bits being set
      
         - Fix use of helpers using PSTATE early on exit by always sampling it
           as soon as the exit takes place
      
         - Move pkvm's 32bit handling into a common helper
      
        RISC-V:
      
         - Fix incorrect KVM_MAX_VCPUS value
      
         - Unmap stage2 mapping when deleting/moving a memslot
      
        x86:
      
         - Fix and downgrade BUG_ON due to uninitialized cache
      
         - Many APICv and MOVE_ENC_CONTEXT_FROM fixes
      
         - Correctly emulate TLB flushes around nested vmentry/vmexit and when
           the nested hypervisor uses VPID
      
         - Prevent modifications to CPUID after the VM has run
      
         - Other smaller bugfixes
      
        Generic:
      
         - Memslot handling bugfixes"
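
The ARM64 item on constant sign extension refers to a general C pitfall: a signed 32-bit value with bit 31 set gains 32 spurious upper bits when widened to 64 bits. The pull message does not show the affected code, so the following is only a generic sketch with hypothetical function names, not the kernel's definitions.

```c
#include <stdint.h>

/*
 * Hypothetical illustration: a signed 32-bit value whose only set bit is
 * bit 31 is negative, so converting it to a 64-bit unsigned type fills the
 * upper 32 bits with ones.
 */
static uint64_t widen_signed_bit31(void)
{
	int32_t bit31 = INT32_MIN;	/* bit pattern 0x80000000 */

	return (uint64_t)bit31;
}

/* The same bit pattern held in an unsigned type widens with zero upper bits. */
static uint64_t widen_unsigned_bit31(void)
{
	uint32_t bit31 = UINT32_C(1) << 31;

	return (uint64_t)bit31;
}
```

Here `widen_signed_bit31()` yields 0xFFFFFFFF80000000 while `widen_unsigned_bit31()` yields 0x0000000080000000; written into a 64-bit register, the first form sets bits that were never intended to be set.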
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits)
        KVM: fix avic_set_running for preemptable kernels
        KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled
        KVM: SEV: accept signals in sev_lock_two_vms
        KVM: SEV: do not take kvm->lock when destroying
        KVM: SEV: Prohibit migration of a VM that has mirrors
        KVM: SEV: Do COPY_ENC_CONTEXT_FROM with both VMs locked
        selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
        KVM: SEV: move mirror status to destination of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
        KVM: SEV: initialize regions_list of a mirror VM
        KVM: SEV: cleanup locking for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
        KVM: SEV: do not use list_replace_init on an empty list
        KVM: x86: Use a stable condition around all VT-d PI paths
        KVM: x86: check PIR even for vCPUs with disabled APICv
        KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
        KVM: selftests: page_table_test: fix calculation of guest_test_phys_mem
        KVM: x86/mmu: Handle "default" period when selectively waking kthread
        KVM: MMU: shadow nested paging does not have PKU
        KVM: x86/mmu: Remove spurious TLB flushes in TDP MMU zap collapsible path
        KVM: x86/mmu: Use yield-safe TDP MMU root iter in MMU notifier unmapping
        KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()
        ...
    • tools: Fix math.h breakage · d6e6a27d
      Matthew Wilcox (Oracle) authored
      Commit 98e1385e ("include/linux/radix-tree.h: replace kernel.h with
      the necessary inclusions") broke the radix tree test suite in two
      different ways: first by including math.h, which didn't exist in the
      tools directory, and second by removing an implicit include of
      spinlock.h before lockdep.h.  Fix both issues.
      
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. Nov 30, 2021
  5. Nov 29, 2021
  6. Nov 28, 2021