Skip to content
  1. Jun 08, 2011
    • Wu Fengguang's avatar
      writeback: introduce writeback_control.inodes_written · cb9bd115
      Wu Fengguang authored
      
      
      The flusher works on dirty inodes in batches, and may quit prematurely
      if the batch of inodes happen to be metadata-only dirtied: in this case
      wbc->nr_to_write won't be decreased at all, which stands for "no pages
      written" but also mis-interpreted as "no progress".
      
      So introduce writeback_control.inodes_written to count the inodes get
      cleaned from VFS POV.  A non-zero value means there are some progress on
      writeback, in which case more writeback can be tried.
      
      Acked-by: default avatarJan Kara <jack@suse.cz>
      Acked-by: default avatarMel Gorman <mel@csn.ul.ie>
      Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
      cb9bd115
    • Wu Fengguang's avatar
      writeback: update dirtied_when for synced inode to prevent livelock · 94c3dcbb
      Wu Fengguang authored
      Explicitly update .dirtied_when on synced inodes, so that they are no
      longer considered for writeback in the next round.
      
      It can prevent both of the following livelock schemes:
      
      - while true; do echo data >> f; done
      - while true; do touch f;        done (in theory)
      
      The exact livelock condition is, during sync(1):
      
      (1) no new inodes are dirtied
      (2) an inode being actively dirtied
      
      On (2), the inode will be tagged and synced with .nr_to_write=LONG_MAX.
      When finished, it will be redirty_tail()ed because it's still dirty
      and (.nr_to_write > 0). redirty_tail() won't update its ->dirtied_when
      on condition (1). The sync work will then revisit it on the next
      queue_io() and find it eligible again because its old ->dirtied_when
      predates the sync work start time.
      
      We'll do more aggressive "keep writeback as long as we wrote something"
      logic in wb_writeback(). The "use LONG_MAX .nr_to_write" trick in commit
      b9543dac
      
       ("writeback: avoid livelocking WB_SYNC_ALL writeback") will
      no longer be enough to stop sync livelock.
      
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
      94c3dcbb
    • Wu Fengguang's avatar
      writeback: introduce .tagged_writepages for the WB_SYNC_NONE sync stage · 6e6938b6
      Wu Fengguang authored
      sync(2) is performed in two stages: the WB_SYNC_NONE sync and the
      WB_SYNC_ALL sync. Identify the first stage with .tagged_writepages and
      do livelock prevention for it, too.
      
      Jan's commit f446daae ("mm: implement writeback livelock avoidance
      using page tagging") is a partial fix in that it only fixed the
      WB_SYNC_ALL phase livelock.
      
      Although ext4 is tested to no longer livelock with commit f446daae
      
      ,
      it may due to some "redirty_tail() after pages_skipped" effect which
      is by no means a guarantee for _all_ the file systems.
      
      Note that writeback_inodes_sb() is called by not only sync(), they are
      treated the same because the other callers also need livelock prevention.
      
      Impact:  It changes the order in which pages/inodes are synced to disk.
      Now in the WB_SYNC_NONE stage, it won't proceed to write the next inode
      until finished with the current inode.
      
      Acked-by: default avatarJan Kara <jack@suse.cz>
      CC: Dave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
      6e6938b6
  2. Jun 06, 2011
  3. Jun 05, 2011
  4. Jun 04, 2011
  5. Jun 03, 2011