  1. Jan 23, 2013
    • async, kmod: warn on synchronous request_module() from async workers · 0fdff3ec
      Tejun Heo authored
      Synchronous request_module() from an async worker can lead to deadlock
      because module init path may invoke async_synchronize_full().  The
      async worker waits for request_module() to complete and the module
      loading waits for the async task to finish.  This bug happened in the
      block layer because of default elevator auto-loading.
      
      Block layer has been updated not to do default elevator auto-loading
      and it has been decided to disallow synchronous request_module() from
      async workers.
      
      Trigger WARN_ON_ONCE() on synchronous request_module() from async
      workers.
      
      For more details, please refer to the following thread.
      
        http://thread.gmane.org/gmane.linux.kernel/1420814
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Alex Riesen <raa.lkml@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      0fdff3ec
    • block: don't request module during elevator init · 21c3c5d2
      Tejun Heo authored
      The block layer allows an elevator built as a module to be selected as
      the system default via the kernel param "elevator=".  This is achieved
      by automatically invoking request_module() whenever a new block device
      is initialized and the elevator is not available.
      
      This led to an interesting deadlock problem involving async and module
      init.  Block device probing running off an async job invokes
      request_module().  While the module is being loaded, it performs
      async_synchronize_full() which ends up waiting for the async job which
      is already waiting for request_module() to finish, leading to
      deadlock.
      
      Invoking request_module() from deep in block device init path is
      already nasty in itself.  It seems best to avoid these situations from
      the beginning by moving on-demand module loading out of block init
      path.
      
      The previous patch made sure that the default elevator module is
      loaded early during boot if available.  This patch removes on-demand
      loading of the default elevator from elevator init path.  As the
      module would have been loaded during boot, userland-visible behavior
      difference should be minimal.
      
      For more details, please refer to the following thread.
      
        http://thread.gmane.org/gmane.linux.kernel/1420814
      
      v2: The bool parameter was named @request_module which conflicted with
          request_module().  This built okay w/ CONFIG_MODULES because
          request_module() was defined as a macro.  W/o CONFIG_MODULES, it
          causes build breakage.  Rename the parameter to @try_loading.
          Reported by Fengguang.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alex Riesen <raa.lkml@gmail.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      21c3c5d2
  2. Jan 19, 2013
    • init, block: try to load default elevator module early during boot · bb813f4c
      Tejun Heo authored
      This patch adds default module loading and uses it to load the default
      block elevator.  During boot, it's called right after initramfs or
      initrd is made available and right before control is passed to
      userland.  This ensures that as long as the modules are available in
      the usual places in initramfs, initrd or the root filesystem, the
      default modules are loaded as soon as possible.
      
      This will replace the on-demand elevator module loading from elevator
      init path.
      
      v2: Fixed build breakage when !CONFIG_BLOCK.  Reported by kbuild test
          robot.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alex Riesen <raa.lkml@gmail.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      bb813f4c
    • workqueue: implement current_is_async() · 84b233ad
      Tejun Heo authored
      This function queries whether %current is an async worker executing an
      async item.  This will be used to implement warning on synchronous
      request_module() from async workers.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      84b233ad
    • workqueue: move struct worker definition to workqueue_internal.h · 2eaebdb3
      Tejun Heo authored
      This will be used to implement an inline function to query whether
      %current is a workqueue worker and, if so, allow determining which
      work item it's executing.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      2eaebdb3
    • workqueue: rename kernel/workqueue_sched.h to kernel/workqueue_internal.h · ea138446
      Tejun Heo authored
      Workqueue wants to expose more interface internal to kernel/.  Instead
      of adding a new header file, repurpose kernel/workqueue_sched.h.
      Rename it to workqueue_internal.h and add include protector.
      
      This patch doesn't introduce any functional changes.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      ea138446
  3. Jan 18, 2013
    • workqueue: set PF_WQ_WORKER on rescuers · 111c225a
      Tejun Heo authored
      PF_WQ_WORKER is used to tell scheduler that the task is a workqueue
      worker and needs wq_worker_sleeping/waking_up() invoked on it for
      concurrency management.  As rescuers never participate in concurrency
      management, PF_WQ_WORKER wasn't set on them.
      
      There's a need for an interface which can query whether %current is
      executing a work item and if so which.  Such interface requires a way
      to identify all tasks which may execute work items and PF_WQ_WORKER
      will be used for that.  As all normal workers always have PF_WQ_WORKER
      set, we only need to add it to rescuers.
      
      As rescuers start with WORKER_PREP but never clear it, it's always
      NOT_RUNNING and there's no need to worry about it interfering with
      concurrency management even if PF_WQ_WORKER is set; however, unlike
      normal workers, rescuers currently don't have their worker struct as
      kthread_data(); they use the associated workqueue_struct instead.
      This is problematic as wq_worker_sleeping/waking_up() expect struct
      worker at kthread_data().
      
      This patch adds worker->rescue_wq, starts rescuer kthreads with the
      worker struct as kthread_data(), and sets PF_WQ_WORKER on rescuers.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      111c225a
  4. Dec 20, 2012
    • workqueue: fix find_worker_executing_work() breakage from hashtable conversion · 023f27d3
      Tejun Heo authored
      42f8570f ("workqueue: use new hashtable implementation") incorrectly
      made busy workers hashed by the pointer value of worker instead of
      work.  This broke find_worker_executing_work() which in turn broke a
      lot of fundamental operations of workqueue - non-reentrancy and
      flushing among others.  The flush malfunction triggered warning in
      disk event code in Fengguang's automated test.
      
       write_dev_root_ (3265) used greatest stack depth: 2704 bytes left
       ------------[ cut here ]------------
       WARNING: at /c/kernel-tests/src/stable/block/genhd.c:1574 disk_clear_events+0xcf/0x108()
       Hardware name: Bochs
       Modules linked in:
       Pid: 3328, comm: ata_id Not tainted 3.7.0-01930-gbff6343 #1167
       Call Trace:
        [<ffffffff810997c4>] warn_slowpath_common+0x83/0x9c
        [<ffffffff810997f7>] warn_slowpath_null+0x1a/0x1c
        [<ffffffff816aea77>] disk_clear_events+0xcf/0x108
        [<ffffffff811bd8be>] check_disk_change+0x27/0x59
        [<ffffffff822e48e2>] cdrom_open+0x49/0x68b
        [<ffffffff81ab0291>] idecd_open+0x88/0xb7
        [<ffffffff811be58f>] __blkdev_get+0x102/0x3ec
        [<ffffffff811bea08>] blkdev_get+0x18f/0x30f
        [<ffffffff811bebfd>] blkdev_open+0x75/0x80
        [<ffffffff8118f510>] do_dentry_open+0x1ea/0x295
        [<ffffffff8118f5f0>] finish_open+0x35/0x41
        [<ffffffff8119c720>] do_last+0x878/0xa25
        [<ffffffff8119c993>] path_openat+0xc6/0x333
        [<ffffffff8119cf37>] do_filp_open+0x38/0x86
        [<ffffffff81190170>] do_sys_open+0x6c/0xf9
        [<ffffffff8119021e>] sys_open+0x21/0x23
        [<ffffffff82c1c3d9>] system_call_fastpath+0x16/0x1b
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      023f27d3
  5. Dec 19, 2012
    • workqueue: consider work function when searching for busy work items · a2c1c57b
      Tejun Heo authored
      To avoid executing the same work item concurrently, workqueue hashes
      currently busy workers according to their current work items and looks
      up the table when it wants to execute a new work item.  If there
      already is a worker which is executing the new work item, the new item
      is queued to the found worker so that it gets executed only after the
      current execution finishes.
      
      Unfortunately, a work item may be freed while being executed and thus
      recycled for different purposes.  If it gets recycled for a different
      work item and queued while the previous execution is still in
      progress, workqueue may make the new work item wait for the old one
      although the two aren't really related in any way.
      
      In extreme cases, this false dependency may lead to deadlock although
      it's extremely unlikely given that there aren't too many self-freeing
      work item users and they usually don't wait for other work items.
      
      To alleviate the problem, record the current work function in each
      busy worker and match it together with the work item address in
      find_worker_executing_work().  While this isn't complete, it ensures
      that unrelated work items don't interact with each other and, in the
      very unlikely case where a twisted wq user triggers it, the work item
      only collides with itself, making the culprit easy to spot.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Andrey Isakov <andy51@gmx.ru>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51701
      Cc: stable@vger.kernel.org
      a2c1c57b
    • workqueue: use new hashtable implementation · 42f8570f
      Sasha Levin authored
      Switch workqueues to use the new hashtable implementation. This reduces the
      amount of generic unrelated code in the workqueues.
      
      This patch depends on d9b482c8 ("hashtable: introduce a small and
      naive hashtable") which was merged in v3.6.
      
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      42f8570f
  6. Dec 18, 2012