Skip to content
  1. Mar 27, 2012
    • Peter Zijlstra's avatar
      sched: Fix select_fallback_rq() vs cpu_active/cpu_online · 2baab4e9
      Peter Zijlstra authored
      Commit 5fbd036b ("sched: Cleanup cpu_active madness"), which was
      supposed to finally sort the cpu_active mess, instead uncovered more.
      
      Since CPU_STARTING is ran before setting the cpu online, there's a
      (small) window where the cpu has active,!online.
      
      If during this time there's a wakeup of a task that used to reside on
      that cpu select_task_rq() will use select_fallback_rq() to compute an
      alternative cpu to run on since we find !online.
      
      select_fallback_rq() however will compute the new cpu against
      cpu_active, this means that it can return the same cpu it started out
      with, the !online one, since that cpu is in fact marked active.
      
      This results in us trying to scheduling a task on an offline cpu and
      triggering a WARN in the IPI code.
      
      The solution proposed by Chuansheng Liu of setting cpu_active in
      set_cpu_online() is buggy, firstly not all archs actually use
      set_cpu_online(), secondly, not all archs call set_cpu_online() with
      IRQs disabled, this means we would introduce either the same race or
      the race from fd8a7de1 ("x86: cpu-hotplug: Prevent softirq wakeup on
      wrong CPU") -- albeit much narrower.
      
      [ By setting online first and active later we have a window of
        online,!active, fresh and bound kthreads have task_cpu() of 0 and
        since cpu0 isn't in tsk_cpus_allowed() we end up in
        select_fallback_rq() which excludes !active, resulting in a reset
        of ->cpus_allowed and the thread running all over the place. ]
      
      The solution is to re-work select_fallback_rq() to require active
      _and_ online. This makes the active,!online case work as expected,
      OTOH archs running CPU_STARTING after setting online are now
      vulnerable to the issue from fd8a7de1
      
       -- these are alpha and
      blackfin.
      
      Reported-by: default avatarChuansheng Liu <chuansheng.liu@intel.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: linux-alpha@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-hubqk1i10o4dpvlm06gq7v6j@git.kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      2baab4e9
    • Peter Zijlstra's avatar
      sched/x86/smp: Do not enable IRQs over calibrate_delay() · bc758133
      Peter Zijlstra authored
      
      
      We should not ever enable IRQs until we're fully set up. This opens up
      a window where interrupts can hit the cpu and interrupts can do
      wakeups, wakeups need state that isn't set-up yet, in particular this
      cpu isn't elegible to run tasks, so if any cpu-affine task that got
      created in CPU_UP_PREPARE manages to get a wakeup, its affinity mask
      will get broken and we'll run into lots of 'interesting' problems.
      
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-yaezmlbriluh166tfkgni22m@git.kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      bc758133
  2. Mar 23, 2012
  3. Mar 15, 2012
  4. Mar 13, 2012
  5. Mar 11, 2012
  6. Mar 10, 2012
  7. Mar 09, 2012
  8. Mar 08, 2012