Commit 82708bb1 authored by Linus Torvalds
Pull btrfs fixes from David Sterba:

 - zoned relocation fixes:
      - fix critical section end for extent writeback, this could lead
        to out of order write
      - prevent writing to previous data relocation block group if space
        gets low

 - reflink fixes:
      - fix race between reflinking and ordered extent completion
      - proper error handling when block reserve migration fails
      - add missing inode iversion/mtime/ctime updates on each iteration
        when replacing extents

 - fix deadlock when running fsync/fiemap/commit at the same time

 - fix false-positive KCSAN report regarding pid tracking for read locks
   and data race

 - minor documentation update and link to new site

* tag 'for-5.19-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Documentation: update btrfs list of features and link to readthedocs.io
  btrfs: fix deadlock with fsync+fiemap+transaction commit
  btrfs: don't set lock_owner when locking extent buffer for reading
  btrfs: zoned: fix critical section of relocation inode writeback
  btrfs: zoned: prevent allocation from previous data relocation BG
  btrfs: do not BUG_ON() on failure to migrate space when replacing extents
  btrfs: add missing inode updates on each iteration when replacing extents
  btrfs: fix race between reflinking and ordered extent completion
parents c898c67d 037e1274
+13 −3
@@ -19,13 +19,23 @@ The main Btrfs features include:
    * Subvolumes (separate internal filesystem roots)
    * Object level mirroring and striping
    * Checksums on data and metadata (multiple algorithms available)
    * Compression
    * Compression (multiple algorithms available)
    * Reflink, deduplication
    * Scrub (on-line checksum verification)
    * Hierarchical quota groups (subvolume and snapshot support)
    * Integrated multiple device support, with several raid algorithms
    * Offline filesystem check
    * Efficient incremental backup and FS mirroring
    * Efficient incremental backup and FS mirroring (send/receive)
    * Trim/discard
    * Online filesystem defragmentation
    * Swapfile support
    * Zoned mode
    * Read/write metadata verification
    * Online resize (shrink, grow)

For more information please refer to the wiki
For more information please refer to the documentation site or wiki

  https://btrfs.readthedocs.io

  https://btrfs.wiki.kernel.org

+1 −0
@@ -104,6 +104,7 @@ struct btrfs_block_group {
	unsigned int relocating_repair:1;
	unsigned int chunk_item_inserted:1;
	unsigned int zone_is_active:1;
	unsigned int zoned_data_reloc_ongoing:1;

	int disk_cache_state;

+2 −0
@@ -1330,6 +1330,8 @@ struct btrfs_replace_extent_info {
	 * existing extent into a file range.
	 */
	bool is_new_extent;
	/* Indicate if we should update the inode's mtime and ctime. */
	bool update_times;
	/* Meaningful only if is_new_extent is true. */
	int qgroup_reserved;
	/*
+18 −2
@@ -3832,7 +3832,7 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
	       block_group->start == fs_info->data_reloc_bg ||
	       fs_info->data_reloc_bg == 0);

	if (block_group->ro) {
	if (block_group->ro || block_group->zoned_data_reloc_ongoing) {
		ret = 1;
		goto out;
	}
@@ -3894,8 +3894,24 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
out:
	if (ret && ffe_ctl->for_treelog)
		fs_info->treelog_bg = 0;
	if (ret && ffe_ctl->for_data_reloc)
	if (ret && ffe_ctl->for_data_reloc &&
	    fs_info->data_reloc_bg == block_group->start) {
		/*
		 * Do not allow further allocations from this block group.
		 * Compared to increasing the ->ro, setting the
		 * ->zoned_data_reloc_ongoing flag still allows nocow
		 *  writers to come in. See btrfs_inc_nocow_writers().
		 *
		 * We need to disable an allocation to avoid an allocation of
		 * regular (non-relocation data) extent. With mix of relocation
		 * extents and regular extents, we can dispatch WRITE commands
		 * (for relocation extents) and ZONE APPEND commands (for
		 * regular extents) at the same time to the same zone, which
		 * easily break the write pointer.
		 */
		block_group->zoned_data_reloc_ongoing = 1;
		fs_info->data_reloc_bg = 0;
	}
	spin_unlock(&fs_info->relocation_bg_lock);
	spin_unlock(&fs_info->treelog_bg_lock);
	spin_unlock(&block_group->lock);
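
Note on the hunk above: the gating logic can be sketched in user space as follows. This is a minimal model, not kernel code. The `toy_` structures and functions are hypothetical stand-ins that keep only the fields the hunk touches, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct btrfs_block_group (fields used here only). */
struct toy_block_group {
	uint64_t start;
	unsigned int ro;
	unsigned int zoned_data_reloc_ongoing:1;
};

/* Hypothetical stand-in for struct btrfs_fs_info. */
struct toy_fs_info {
	uint64_t data_reloc_bg;	/* 0: no dedicated data relocation block group */
};

/* Models the early-exit check: allocation is refused if either flag is set. */
static bool toy_may_allocate(const struct toy_block_group *bg)
{
	return !(bg->ro || bg->zoned_data_reloc_ongoing);
}

/*
 * Models the "out:" path after a failed data relocation allocation.
 * Only the block group that is currently the dedicated relocation
 * block group gets fenced off; any other block group is left alone.
 */
static void toy_data_reloc_alloc_failed(struct toy_fs_info *fs,
					struct toy_block_group *bg)
{
	assert(fs && bg);
	if (fs->data_reloc_bg == bg->start) {
		bg->zoned_data_reloc_ongoing = 1;
		fs->data_reloc_bg = 0;
	}
}
```

In this model, a block group that was the relocation target stops accepting allocations once an allocation from it fails, while an unrelated block group that fails the same way keeps its flag clear and `data_reloc_bg` stays untouched.
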
+2 −1
@@ -5241,13 +5241,14 @@ int extent_writepages(struct address_space *mapping,
	 */
	btrfs_zoned_data_reloc_lock(BTRFS_I(inode));
	ret = extent_write_cache_pages(mapping, wbc, &epd);
	btrfs_zoned_data_reloc_unlock(BTRFS_I(inode));
	ASSERT(ret <= 0);
	if (ret < 0) {
		btrfs_zoned_data_reloc_unlock(BTRFS_I(inode));
		end_write_bio(&epd, ret);
		return ret;
	}
	flush_write_bio(&epd);
	btrfs_zoned_data_reloc_unlock(BTRFS_I(inode));
	return ret;
}
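
Note on the hunk above: as the pull message puts it, ending the critical section too early "could lead to out of order write". The sketch below is a deterministic, single-threaded simulation of that ordering problem, under the assumption that pages collected during writeback only reach the device at flush time. All `toy_` names are hypothetical, and a real race would involve concurrent writers rather than the fixed interleaving used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_EVENTS 8

/* Hypothetical model of one zone and the lock guarding relocation writeback. */
struct toy_zone {
	bool locked;
	int submitted[TOY_MAX_EVENTS];	/* writer ids in device submission order */
	size_t nr_submitted;
};

static void toy_submit(struct toy_zone *z, int id)
{
	assert(z->nr_submitted < TOY_MAX_EVENTS);
	z->submitted[z->nr_submitted++] = id;
}

/* A competing writer can submit only while the lock is free. */
static bool toy_other_writer_try_submit(struct toy_zone *z, int id)
{
	if (z->locked)
		return false;
	toy_submit(z, id);
	return true;
}

/*
 * Models writeback for writer `id`. With unlock_before_flush set, the lock
 * is dropped after collecting pages but before they are flushed (the old
 * ordering). Writer `racer_id` attempts to submit in that window.
 */
static void toy_writepages(struct toy_zone *z, int id,
			   bool unlock_before_flush, int racer_id)
{
	z->locked = true;
	/* Pages are collected here; nothing has reached the device yet. */
	if (unlock_before_flush)
		z->locked = false;
	toy_other_writer_try_submit(z, racer_id);
	toy_submit(z, id);	/* the flush: pending writes reach the device */
	z->locked = false;
}
```

With `unlock_before_flush` set, the racer's write lands before the writer that took the lock first. With the unlock moved after the flush, the racer is held off until the first writer's data has been submitted.
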
