  1. Dec 21, 2012
    • FS-Cache: Fix operation state management and accounting · 9f10523f
      David Howells authored
      
      
      Fix the state management of internal fscache operations and the accounting of
      what operations are in what states.
      
      This is done by:
      
       (1) Give struct fscache_operation an enum variable that directly represents the
           state it's currently in, rather than spreading this knowledge over a bunch
           of flags, who's processing the operation at the moment and whether it is
           queued or not.
      
           This makes it easier to write assertions to check the state at various
           points and to prevent invalid state transitions.
      
       (2) Add an 'operation complete' state and supply a function to indicate the
           completion of an operation (fscache_op_complete()) and make things call
           it.  The final call to fscache_put_operation() can then check that an op
           is in the appropriate state (complete or cancelled).
      
       (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
           govern the state of an object:
      
      	(a) The ->n_ops is now the number of extant operations on the object
      	    and is now decremented by fscache_put_operation() only.
      
      	(b) The ->n_in_progress is simply the number of operations that have
      	    been taken off of the object's pending queue for the purposes of
      	    being run.  This is decremented by fscache_op_complete() only.
      
      	(c) The ->n_exclusive is the number of exclusive ops that have been
      	    submitted and queued or are in progress.  It is decremented by
      	    fscache_op_complete() and by fscache_cancel_op().
      
           fscache_put_operation() and fscache_operation_gc() now no longer try to
           clean up ->n_exclusive and ->n_in_progress.  That was leading to double
           decrements against fscache_cancel_op().
      
           fscache_cancel_op() now no longer decrements ->n_ops.  That was leading to
           double decrements against fscache_put_operation().
      
           fscache_submit_exclusive_op() now decides whether it has to queue an op
           based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
           will persist in being true even after all preceding operations have been
           cancelled or completed.  Furthermore, if an object is active and there are
           runnable ops against it, there must be at least one op running.
      
       (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
           provide a function to record completion of the pages as they complete.
      
           When n_pages reaches 0, the operation is deemed to be complete and
           fscache_op_complete() is called.
      
           Add calls to fscache_retrieval_complete() anywhere we've finished with a
           page we've been given to read or allocate for.  This includes places where
           we just return pages to the netfs for reading from the server and where
           accessing the cache fails and we discard the proposed netfs page.
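
      The accounting rules in (1) to (4) can be sketched as a small userspace
      model.  This is only an illustration under stated assumptions, not the
      kernel code: the state names, the struct layouts and the helper names
      (op_submit(), op_run(), op_cancel() and so on) are hypothetical, and plain
      ints stand in for the locking and atomics the kernel would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state names: the commit only says one enum replaces the flags. */
enum op_state {
	OP_ST_PENDING,		/* queued on the object, not yet run */
	OP_ST_IN_PROGRESS,	/* taken off the pending queue to be run */
	OP_ST_COMPLETE,
	OP_ST_CANCELLED,
	OP_ST_DEAD,		/* final reference dropped */
};

struct object {
	int n_ops;		/* extant operations */
	int n_in_progress;	/* ops taken off the pending queue */
	int n_exclusive;	/* exclusive ops queued or in progress */
};

struct op {
	struct object *obj;
	enum op_state state;
	bool exclusive;
	int usage;		/* reference count */
};

/* Model of a retrieval op carrying the remaining-pages counter from (4). */
struct retrieval {
	struct op op;
	int n_pages;		/* pages not yet finished with */
};

void op_submit(struct op *op, struct object *obj, bool exclusive)
{
	op->obj = obj;
	op->exclusive = exclusive;
	op->usage = 1;
	op->state = OP_ST_PENDING;
	obj->n_ops++;
	if (exclusive)
		obj->n_exclusive++;
}

/* An exclusive op has to queue only while something is actually running. */
bool exclusive_must_wait(const struct object *obj)
{
	return obj->n_in_progress > 0;
}

void op_run(struct op *op)
{
	assert(op->state == OP_ST_PENDING);
	op->state = OP_ST_IN_PROGRESS;
	op->obj->n_in_progress++;
}

/* The only place n_in_progress goes down. */
void op_complete(struct op *op)
{
	assert(op->state == OP_ST_IN_PROGRESS);
	op->obj->n_in_progress--;
	if (op->exclusive)
		op->obj->n_exclusive--;
	op->state = OP_ST_COMPLETE;
}

/* In this model only a still-pending op can be cancelled; n_ops is untouched. */
bool op_cancel(struct op *op)
{
	if (op->state != OP_ST_PENDING)
		return false;
	if (op->exclusive)
		op->obj->n_exclusive--;
	op->state = OP_ST_CANCELLED;
	return true;
}

/* The only place n_ops goes down; checks the op finished one way or another. */
void op_put(struct op *op)
{
	if (--op->usage > 0)
		return;
	assert(op->state == OP_ST_COMPLETE || op->state == OP_ST_CANCELLED);
	op->obj->n_ops--;
	op->state = OP_ST_DEAD;
}

/* Record that n_pages pages are finished with; the last one completes the op. */
void retrieval_complete(struct retrieval *r, int n_pages)
{
	assert(n_pages <= r->n_pages);
	r->n_pages -= n_pages;
	if (r->n_pages == 0)
		op_complete(&r->op);
}
```

      Because each counter has exactly one completion path and one cancellation
      path in this model, a cancel followed by the final put cannot decrement
      the same counter twice, which is the class of bug described above.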
      
      The bugs in the unfixed state management manifest themselves as oopses like the
      following where the operation completion gets out of sync with return of the
      cookie by the netfs.  This is possible because the cache unlocks and returns
      all the netfs pages before recording its completion - which means that there's
      nothing to stop the netfs discarding them and returning the cookie.
      
      
      FS-Cache: Cookie 'NFS.fh' still has outstanding reads
      ------------[ cut here ]------------
      kernel BUG at fs/fscache/cookie.c:519!
      invalid opcode: 0000 [#1] SMP
      CPU 1
      Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
      
      Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090                  /DG965RY
      RIP: 0010:[<ffffffffa007050a>]  [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
      RSP: 0018:ffff8800368cfb00  EFLAGS: 00010282
      RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
      RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
      RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
      R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
      R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
      FS:  0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
      Stack:
       ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
       ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
       ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
      Call Trace:
       [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
       [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
       [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
       [<ffffffff810d8d47>] evict+0xa1/0x15c
       [<ffffffff810d8e2e>] dispose_list+0x2c/0x38
       [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
       [<ffffffff810c56b7>] prune_super+0xd5/0x140
       [<ffffffff8109b615>] shrink_slab+0x102/0x1ab
       [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
       [<ffffffff8103e009>] ? process_timeout+0xb/0xb
       [<ffffffff8109dba3>] kswapd+0x270/0x289
       [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
       [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
       [<ffffffff8104bf7a>] kthread+0x7f/0x87
       [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
       [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
       [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
       [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
       [<ffffffff813ad6b0>] ? gs_change+0xb/0xb
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • FS-Cache: Make cookie relinquishment wait for outstanding reads · ef46ed88
      David Howells authored
      
      
      Make fscache_relinquish_cookie() log a warning and wait if there are any
      outstanding reads left on the cookie it was given.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • CacheFiles: Make some debugging statements conditional · 37491a13
      David Howells authored
      
      
      Downgrade some debugging statements so that they no longer print
      unconditionally, but only when the appropriate module parameter is set.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • FS-Cache: Check that there are no read ops when cookie relinquished · 0f972b56
      David Howells authored
      
      
      Check that the netfs isn't trying to relinquish a cookie that still has read
      operations in progress upon it.  If there are, then log a warning and BUG.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • CacheFiles: Downgrade the requirements passed to the allocator · 5f4f9f4a
      David Howells authored
      
      
      Downgrade the requirements passed to the allocator in the gfp flags parameter.
      FS-Cache/CacheFiles can handle OOM conditions simply by aborting the attempt to
      store an object or a page in the cache.
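
      The idea of treating an allocation failure as "skip the cache" rather than
      as a hard error can be sketched in userspace as follows.  This is a
      hedged illustration only: cache_try_store(), struct cache_slot and the
      injectable allocator are hypothetical names, and malloc() stands in for
      the kernel allocator and its gfp flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Injectable allocator so that a failing one can be substituted in tests. */
typedef void *(*alloc_fn)(size_t);

struct cache_slot {
	void *copy;	/* cached copy of the data, or NULL if nothing stored */
	size_t len;
};

/*
 * Try to store a copy of the data in the cache.  Returns true if stored and
 * false if the store was abandoned.  Abandoning is harmless here because the
 * cache is only an optimisation: the caller still has the data.
 */
bool cache_try_store(struct cache_slot *slot, const void *data, size_t len,
		     alloc_fn alloc)
{
	void *buf = alloc(len);

	if (!buf)
		return false;	/* out of memory: abort the store quietly */
	memcpy(buf, data, len);
	slot->copy = buf;
	slot->len = len;
	return true;
}

/* Allocator that always fails, to model an OOM condition. */
void *failing_alloc(size_t len)
{
	(void)len;
	return NULL;
}
```

      With failing_alloc() the store is simply skipped and the slot is left
      untouched; with malloc() the data is copied in as usual.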
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • CacheFiles: Fix the marking of cached pages · c4d6d8db
      David Howells authored
      
      
      Under some circumstances CacheFiles defers the marking of pages with PG_fscache
      so that it can take advantage of pagevecs to reduce the number of calls to
      fscache_mark_pages_cached() and the netfs's hook to keep track of this.
      
      There are, however, two problems with this:
      
       (1) It can lead to the PG_fscache mark being applied _after_ the page is set
           PG_uptodate and unlocked (by the call to fscache_end_io()).
      
       (2) CacheFiles's ref on the page is dropped immediately following
           fscache_end_io() - and so may not still be held when the mark is applied.
           This can lead to the page being passed back to the allocator before the
           mark is applied.
      
      Fix this by, where appropriate, marking the page before calling
      fscache_end_io() and releasing the page.  This means that we can't take
      advantage of pagevecs and have to make a separate call for each page to the
      marking routines.
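
      The ordering constraint can be sketched with a toy userspace model of a
      page.  This is an illustration under assumptions, not the CacheFiles
      code: struct page here is a made-up stand-in, and mark_page_cached(),
      end_io(), put_page_ref() and finish_read_fixed() are hypothetical helpers
      that only mimic the roles of the real marking, end-of-I/O and
      reference-dropping steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page; the real struct page looks nothing like this. */
struct page {
	int refcount;
	bool cached_mark;	/* plays the role of PG_fscache */
	bool uptodate;		/* plays the role of PG_uptodate */
	bool locked;
	bool freed;		/* set once the last reference is dropped */
};

/* Marking is only safe while we still hold a reference on a live page. */
void mark_page_cached(struct page *p)
{
	assert(p->refcount > 0 && !p->freed);
	p->cached_mark = true;
}

/* After this the page is visible to others as up to date and unlocked. */
void end_io(struct page *p)
{
	p->uptodate = true;
	p->locked = false;
}

/* Drop one reference; the page goes back to the allocator at zero. */
void put_page_ref(struct page *p)
{
	if (--p->refcount == 0)
		p->freed = true;
}

/*
 * Fixed ordering: mark first, then end the I/O, then drop the cache's
 * reference.  The mark is therefore applied while the page is still locked
 * and still pinned by the cache.
 */
void finish_read_fixed(struct page *p)
{
	mark_page_cached(p);
	end_io(p);
	put_page_ref(p);
}
```

      Swapping the calls so that mark_page_cached() runs after put_page_ref()
      would trip the assertion in this model whenever the cache held the last
      reference, which corresponds to problem (2) above.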
      
      The symptoms of this are Bad Page state errors cropping up under memory
      pressure, for example:
      
      BUG: Bad page state in process tar  pfn:002da
      page:ffffea0000009fb0 count:0 mapcount:0 mapping:          (null) index:0x1447
      page flags: 0x1000(private_2)
      Pid: 4574, comm: tar Tainted: G        W   3.1.0-rc4-fsdevel+ #1064
      Call Trace:
       [<ffffffff8109583c>] ? dump_page+0xb9/0xbe
       [<ffffffff81095916>] bad_page+0xd5/0xea
       [<ffffffff81095d82>] get_page_from_freelist+0x35b/0x46a
       [<ffffffff810961f3>] __alloc_pages_nodemask+0x362/0x662
       [<ffffffff810989da>] __do_page_cache_readahead+0x13a/0x267
       [<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
       [<ffffffff81098d7b>] ra_submit+0x1c/0x20
       [<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
       [<ffffffff81098ee2>] ? ondemand_readahead+0x163/0x29a
       [<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
       [<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
       [<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
       [<ffffffff810c22c4>] do_sync_read+0xba/0xfa
       [<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
       [<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
       [<ffffffff810c29a4>] vfs_read+0xaa/0x13a
       [<ffffffff810c2a79>] sys_read+0x45/0x6c
       [<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b
      
      As can be seen, PG_private_2 (== PG_fscache) is set in the page flags.
      
      Instrumenting fscache_mark_pages_cached() to verify whether page->mapping was
      set appropriately showed that sometimes it wasn't.  This led to the discovery
      that sometimes the page has apparently been reclaimed by the time the marker
      got to see it.
      
      Reported-by: M. Stevens <m@tippett.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
    • ARM: OMAP: Fix build breakage due to missing include in i2c.c · 18000985
      Vaibhav Bedia authored
      Merge commit 752451f0 ("Merge branch 'i2c-embedded/for-next' of
      git://git.pengutronix.de/git/wsa/linux") resulted in a build breakage
      for OMAP:
      
        arch/arm/mach-omap2/i2c.c: In function 'omap_pm_set_max_mpu_wakeup_lat_compat':
        arch/arm/mach-omap2/i2c.c:130:2: error: implicit declaration of function 'omap_pm_set_max_mpu_wakeup_lat'
        make[1]: *** [arch/arm/mach-omap2/i2c.o] Error 1
      
      Fix this by including the appropriate header file with the function
      prototype.
      
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Vaibhav Bedia <vaibhav.bedia@ti.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · b7dfde95
      Linus Torvalds authored
      Pull virtio update from Rusty Russell:
       "Some nice cleanups, and even a patch my wife did as a "live" demo for
        Latinoware 2012.
      
        There's a slightly non-trivial merge in virtio-net, as we cleaned up
        the virtio add_buf interface while DaveM accepted the mq virtio-net
        patches."
      
      * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
        virtio_console: Add support for remoteproc serial
        virtio_console: Merge struct buffer_token into struct port_buffer
        virtio: add drv_to_virtio to make code clearly
        virtio: use dev_to_virtio wrapper in virtio
        virtio-mmio: Fix irq parsing in command line parameter
        virtio_console: Free buffers from out-queue upon close
        virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
        virtio_console: Use kmalloc instead of kzalloc
        virtio_console: Free buffer if splice fails
        virtio: tools: make it clear that virtqueue_add_buf...
  2. Dec 20, 2012