Commit 484167da authored by Qu Wenruo, committed by David Sterba

btrfs: defrag: fix wrong number of defragged sectors



[BUG]
Users of the autodefrag mount option have reported an obvious increase
in IO:

> If I compare the write average (in total, I don't have it per process)
> when taking idle periods on the same machine:
>     Linux 5.16:
>         without autodefrag: ~ 10KiB/s
>         with autodefrag: between 1 and 2MiB/s.
>
>     Linux 5.15:
>         with autodefrag:~ 10KiB/s (around the same as without
> autodefrag on 5.16)

[CAUSE]
When the autodefrag mount option is enabled, btrfs_defrag_file() is
called with @max_sectors = BTRFS_DEFRAG_BATCH (1024) to limit how many
sectors we can defrag in one try.

The number of sectors defragged is then used to determine if we need to
re-defrag.

But commit b18c3ab2 ("btrfs: defrag: introduce helper to defrag one
cluster") uses the wrong unit when increasing @sectors_defragged, which
should be counted in sectors, not bytes.

This means, if we have defragged any sector, then @sectors_defragged
will be >= sectorsize (normally 4096), which is larger than
BTRFS_DEFRAG_BATCH.

This makes the @max_sectors check in defrag_one_cluster() underflow,
rendering the whole @max_sectors check useless.

This causes far more IO with the autodefrag mount option, as there is
now no limit on how many sectors can actually be defragged.
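
The arithmetic behind the cause can be checked in userspace. The following
is a minimal sketch, not kernel code: the helper names, the use of uint64_t
for the counters, and the 4096-byte sector size are assumptions made for
illustration. It models the pre-fix behaviour of adding a byte length to a
sector counter and testing the limit only for equality.

```c
#include <assert.h>
#include <stdint.h>

#define DEFRAG_BATCH	1024u	/* BTRFS_DEFRAG_BATCH, as quoted above */
#define SECTORSIZE_BITS	12u	/* assumption: 4096-byte sectors */

/* Pre-fix accounting: the byte length is added to a sector counter. */
static uint64_t account_bytes(uint64_t sectors_defragged, uint32_t range_len)
{
	return sectors_defragged + range_len;
}

/* Pre-fix limit test: only an exact match stops the loop. */
static int limit_reached_exact(uint64_t max_sectors, uint64_t sectors_defragged)
{
	return max_sectors && max_sectors == sectors_defragged;
}

/* Sectors left in the batch; unsigned, so it wraps once the limit is passed. */
static uint64_t sectors_left(uint64_t max_sectors, uint64_t sectors_defragged)
{
	return max_sectors - sectors_defragged;
}

/* Walk through the failure for a single one-sector range. */
static void demo_overshoot(void)
{
	uint64_t done = account_bytes(0, 1u << SECTORSIZE_BITS);

	/* One sector was defragged, but the counter now reads 4096. */
	assert(done == 4096);
	/* That is already past the batch limit of 1024. */
	assert(done > DEFRAG_BATCH);
	/* The equality test never fires, so the loop keeps going. */
	assert(!limit_reached_exact(DEFRAG_BATCH, done));
	/* The remaining budget has wrapped to a huge value. */
	assert(sectors_left(DEFRAG_BATCH, done) > DEFRAG_BATCH);
}
```

Calling demo_overshoot() passes all four assertions: after a single sector
the counter has already skipped past the limit, and every later check sees
an effectively unlimited budget.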

[FIX]
Fix the problems by:

- Using sector as the unit when increasing @sectors_defragged

- Also breaking the loop when @sectors_defragged > @max_sectors

- Adding an extra comment on the return value of btrfs_defrag_file()
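
The first two points can be sketched as a userspace model of the corrected
loop. This is an illustration under stated assumptions, not the kernel
implementation: defrag_ranges() and its array argument are hypothetical
stand-ins for the list walk in defrag_one_cluster(), sectors are assumed to
be 4096 bytes, and the clamping step is an assumption about what follows
the `if (max_sectors)` line that the diff does not show.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTORSIZE_BITS	12u	/* assumption: 4096-byte sectors */

/*
 * Hypothetical model of the fixed loop. range_len[] holds byte lengths,
 * each a multiple of the sector size. Returns the number of sectors
 * "defragged"; max_sectors == 0 means no limit.
 */
static uint64_t defrag_ranges(const uint32_t *range_len, size_t nr,
			      uint64_t max_sectors)
{
	uint64_t sectors_defragged = 0;

	for (size_t i = 0; i < nr; i++) {
		uint32_t len = range_len[i];

		/* Reached or beyond the limit */
		if (max_sectors && sectors_defragged >= max_sectors)
			break;

		/* Assumed clamp: never process more than the batch allows. */
		if (max_sectors) {
			uint64_t left = (max_sectors - sectors_defragged)
					<< SECTORSIZE_BITS;

			if (len > left)
				len = (uint32_t)left;
		}

		/* Count sectors, not bytes. */
		sectors_defragged += len >> SECTORSIZE_BITS;
	}
	return sectors_defragged;
}
```

With three ranges of 600 sectors each and a limit of 1024, this model
processes 600 sectors, then 424, then stops; with a limit of 0 it processes
all 1800.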

Reported-by: Anthony Ruhier <aruhier@mailbox.org>
Fixes: b18c3ab2 ("btrfs: defrag: introduce helper to defrag one cluster")
Link: https://lore.kernel.org/linux-btrfs/0a269612-e43f-da22-c5bc-b34b1b56ebe8@mailbox.org/
CC: stable@vger.kernel.org # 5.16
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
parent b767c2fc
+7 −3
@@ -1442,8 +1442,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 	list_for_each_entry(entry, &target_list, list) {
 		u32 range_len = entry->len;

-		/* Reached the limit */
-		if (max_sectors && max_sectors == *sectors_defragged)
+		/* Reached or beyond the limit */
+		if (max_sectors && *sectors_defragged >= max_sectors)
 			break;

 		if (max_sectors)
@@ -1465,7 +1465,8 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
 				       extent_thresh, newer_than, do_compress);
 		if (ret < 0)
 			break;
-		*sectors_defragged += range_len;
+		*sectors_defragged += range_len >>
+				      inode->root->fs_info->sectorsize_bits;
 	}
 out:
 	list_for_each_entry_safe(entry, tmp, &target_list, list) {
@@ -1484,6 +1485,9 @@ static int defrag_one_cluster(struct btrfs_inode *inode,
  * @newer_than:	   minimum transid to defrag
  * @max_to_defrag: max number of sectors to be defragged, if 0, the whole inode
  *		   will be defragged.
+ *
+ * Return <0 for error.
+ * Return >=0 for the number of sectors defragged.
  */
 int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 		      struct btrfs_ioctl_defrag_range_args *range,