Loading Documentation/filesystems/Locking +12 −10 Original line number Diff line number Diff line Loading @@ -92,8 +92,8 @@ prototypes: void (*destroy_inode)(struct inode *); void (*dirty_inode) (struct inode *); int (*write_inode) (struct inode *, int); void (*drop_inode) (struct inode *); void (*delete_inode) (struct inode *); int (*drop_inode) (struct inode *); void (*evict_inode) (struct inode *); void (*put_super) (struct super_block *); void (*write_super) (struct super_block *); int (*sync_fs)(struct super_block *sb, int wait); Loading @@ -101,14 +101,13 @@ prototypes: int (*unfreeze_fs) (struct super_block *); int (*statfs) (struct dentry *, struct kstatfs *); int (*remount_fs) (struct super_block *, int *, char *); void (*clear_inode) (struct inode *); void (*umount_begin) (struct super_block *); int (*show_options)(struct seq_file *, struct vfsmount *); ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t); locking rules: All may block. All may block [not true, see below] None have BKL s_umount alloc_inode: Loading @@ -116,22 +115,25 @@ destroy_inode: dirty_inode: (must not sleep) write_inode: drop_inode: !!!inode_lock!!! delete_inode: evict_inode: put_super: write write_super: read sync_fs: read freeze_fs: read unfreeze_fs: read statfs: no remount_fs: maybe (see below) clear_inode: statfs: maybe(read) (see below) remount_fs: write umount_begin: no show_options: no (namespace_sem) quota_read: no (see below) quota_write: no (see below) ->remount_fs() will have the s_umount exclusive lock if it's already mounted. When called from get_sb_single, it does NOT have the s_umount lock. ->statfs() has s_umount (shared) when called by ustat(2) (native or compat), but that's an accident of bad API; s_umount is used to pin the superblock down when we only have dev_t given us by userland to identify the superblock. Everything else (statfs(), fstatfs(), etc.) doesn't hold it when calling ->statfs() - superblock is pinned down by resolving the pathname passed to syscall. ->quota_read() and ->quota_write() functions are both guaranteed to be the only ones operating on the quota file by the quota code (via dqio_sem) (unless an admin really wants to screw up something and Loading Documentation/filesystems/porting +27 −0 Original line number Diff line number Diff line Loading @@ -291,3 +291,30 @@ be in order of zeroing blocks using block_truncate_page or similar helpers, size update and on finally on-disk truncation which should not fail. inode_change_ok now includes the size checks for ATTR_SIZE and must be called in the beginning of ->setattr unconditionally. [mandatory] ->clear_inode() and ->delete_inode() are gone; ->evict_inode() should be used instead. It gets called whenever the inode is evicted, whether it has remaining links or not. Caller does *not* evict the pagecache or inode-associated metadata buffers; getting rid of those is responsibility of method, as it had been for ->delete_inode(). ->drop_inode() returns int now; it's called on final iput() with inode_lock held and it returns true if filesystems wants the inode to be dropped. As before, generic_drop_inode() is still the default and it's been updated appropriately. generic_delete_inode() is also alive and it consists simply of return 1. Note that all actual eviction work is done by caller after ->drop_inode() returns. clear_inode() is gone; use end_writeback() instead. As before, it must be called exactly once on each call of ->evict_inode() (as it used to be for each call of ->delete_inode()). Unlike before, if you are using inode-associated metadata buffers (i.e. mark_buffer_dirty_inode()), it's your responsibility to call invalidate_inode_buffers() before end_writeback(). No async writeback (and thus no calls of ->write_inode()) will happen after end_writeback() returns, so actions that should not overlap with ->write_inode() (e.g. freeing on-disk inode if i_nlink is 0) ought to be done after that call. NOTE: checking i_nlink in the beginning of ->write_inode() and bailing out if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput() may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly free the on-disk inode, you may end up doing that while ->write_inode() is writing to it. Loading
Documentation/filesystems/Locking +12 −10 Original line number Diff line number Diff line Loading @@ -92,8 +92,8 @@ prototypes: void (*destroy_inode)(struct inode *); void (*dirty_inode) (struct inode *); int (*write_inode) (struct inode *, int); void (*drop_inode) (struct inode *); void (*delete_inode) (struct inode *); int (*drop_inode) (struct inode *); void (*evict_inode) (struct inode *); void (*put_super) (struct super_block *); void (*write_super) (struct super_block *); int (*sync_fs)(struct super_block *sb, int wait); Loading @@ -101,14 +101,13 @@ prototypes: int (*unfreeze_fs) (struct super_block *); int (*statfs) (struct dentry *, struct kstatfs *); int (*remount_fs) (struct super_block *, int *, char *); void (*clear_inode) (struct inode *); void (*umount_begin) (struct super_block *); int (*show_options)(struct seq_file *, struct vfsmount *); ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t); locking rules: All may block. All may block [not true, see below] None have BKL s_umount alloc_inode: Loading @@ -116,22 +115,25 @@ destroy_inode: dirty_inode: (must not sleep) write_inode: drop_inode: !!!inode_lock!!! delete_inode: evict_inode: put_super: write write_super: read sync_fs: read freeze_fs: read unfreeze_fs: read statfs: no remount_fs: maybe (see below) clear_inode: statfs: maybe(read) (see below) remount_fs: write umount_begin: no show_options: no (namespace_sem) quota_read: no (see below) quota_write: no (see below) ->remount_fs() will have the s_umount exclusive lock if it's already mounted. When called from get_sb_single, it does NOT have the s_umount lock. ->statfs() has s_umount (shared) when called by ustat(2) (native or compat), but that's an accident of bad API; s_umount is used to pin the superblock down when we only have dev_t given us by userland to identify the superblock. Everything else (statfs(), fstatfs(), etc.) doesn't hold it when calling ->statfs() - superblock is pinned down by resolving the pathname passed to syscall. ->quota_read() and ->quota_write() functions are both guaranteed to be the only ones operating on the quota file by the quota code (via dqio_sem) (unless an admin really wants to screw up something and Loading
Documentation/filesystems/porting +27 −0 Original line number Diff line number Diff line Loading @@ -291,3 +291,30 @@ be in order of zeroing blocks using block_truncate_page or similar helpers, size update and on finally on-disk truncation which should not fail. inode_change_ok now includes the size checks for ATTR_SIZE and must be called in the beginning of ->setattr unconditionally. [mandatory] ->clear_inode() and ->delete_inode() are gone; ->evict_inode() should be used instead. It gets called whenever the inode is evicted, whether it has remaining links or not. Caller does *not* evict the pagecache or inode-associated metadata buffers; getting rid of those is responsibility of method, as it had been for ->delete_inode(). ->drop_inode() returns int now; it's called on final iput() with inode_lock held and it returns true if filesystems wants the inode to be dropped. As before, generic_drop_inode() is still the default and it's been updated appropriately. generic_delete_inode() is also alive and it consists simply of return 1. Note that all actual eviction work is done by caller after ->drop_inode() returns. clear_inode() is gone; use end_writeback() instead. As before, it must be called exactly once on each call of ->evict_inode() (as it used to be for each call of ->delete_inode()). Unlike before, if you are using inode-associated metadata buffers (i.e. mark_buffer_dirty_inode()), it's your responsibility to call invalidate_inode_buffers() before end_writeback(). No async writeback (and thus no calls of ->write_inode()) will happen after end_writeback() returns, so actions that should not overlap with ->write_inode() (e.g. freeing on-disk inode if i_nlink is 0) ought to be done after that call. NOTE: checking i_nlink in the beginning of ->write_inode() and bailing out if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput() may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly free the on-disk inode, you may end up doing that while ->write_inode() is writing to it.