Commit 12a5c9f8, authored by yangerkun, committed by Long Li

xfs: fix xfs shutdown since we reserve more blocks in agfl fixup

hulk inclusion
category: bugfix
bugzilla: 188788, https://gitee.com/openeuler/kernel/issues/I76JSK
CVE: NA

--------------------------------

Two freelist fixups for the same AG may happen within a single
transaction, and AGFL blocks consumed after the first fixup may cause the
second fixup to fail. This is unintended behavior and leads to an xfs
shutdown [1][2].

Gao Xiang proposed a solution that reserves more blocks during the first
fixup, but it has several logic errors:

- postallocs may be 1 on the first fixup and 0 on the second, which can
  trigger pointless AGFL filling or shortening.
- The case above (postallocs is 1 first, then 0) shows that we may need
  to shorten the AGFL, but xfs_alloc_fix_freelist can only free AGFL
  blocks after the free space check succeeds. Besides, filling or
  shortening the AGFL does not change fdblocks, so we can reach a state
  where fdblocks (or resblocks) shows available space but the AG fixup
  rejects the allocation, and xfs can shut down as well.
- Once postallocs is 1, it also changes the logic of
  xfs_alloc_ag_max_usable, which changes block allocation behavior
  (found by checking each AG's freeblocks after fallocating a huge
  file).
- Once postallocs is 1, we reserve 2 * xfs_alloc_min_freelist(), but
  this is sometimes not enough once the bno/cnt btrees grow and the
  second fixup needs a larger reserve.
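
To make the last point concrete, here is a simplified, stand-alone model
of the pre-patch requirement. The struct and helper names (ag_model,
model_min_freelist, old_need) are hypothetical and not kernel code. The
per-btree term (current height plus one, capped at the maximum height) is
an assumption about how xfs_alloc_min_freelist() sizes the freelist and
should be checked against the tree being patched.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-AG and per-mount state involved. */
struct ag_model {
	unsigned int bno_level;      /* current height of the by-block free space btree */
	unsigned int cnt_level;      /* current height of the by-size free space btree */
	unsigned int rmap_level;     /* current height of the rmap btree */
	unsigned int ag_maxlevels;   /* models mp->m_ag_maxlevels */
	unsigned int rmap_maxlevels; /* models mp->m_rmap_maxlevels */
	bool has_rmapbt;             /* models xfs_has_rmapbt(mp) */
};

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Assumed shape of xfs_alloc_min_freelist(): each btree may need one block
 * per existing level plus one for a new root, capped at its maximum height.
 */
static unsigned int model_min_freelist(const struct ag_model *ag)
{
	unsigned int min_free;

	assert(ag->ag_maxlevels > 0);

	min_free = min_u(ag->bno_level + 1, ag->ag_maxlevels);
	min_free += min_u(ag->cnt_level + 1, ag->ag_maxlevels);
	if (ag->has_rmapbt)
		min_free += min_u(ag->rmap_level + 1, ag->rmap_maxlevels);

	return min_free;
}

/* Pre-patch check value: need = min_freelist * (1 + postallocs). */
static unsigned int old_need(const struct ag_model *ag, unsigned int postallocs)
{
	return model_min_freelist(ag) * (1 + postallocs);
}
```

With sample values of single-level bno/cnt btrees, no rmapbt and
ag_maxlevels = 5, model_min_freelist() is 4 and old_need() with
postallocs = 1 is 8. If both btrees then grow to two levels, the minimum
for the next fixup becomes 6. The doubled value was computed from the
heights seen at the first fixup, so it does not track that growth.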

This patch fixes all of the bugs above by using m_ag_maxlevels to reserve
more blocks, and adapts xfs_alloc_set_aside/xfs_alloc_ag_max_usable to
match the larger reserve. Besides, we only reserve more; we do not fill or
shorten the AGFL according to that reserve.
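
The arithmetic of the new reserve can be sketched in a stand-alone form.
The names below (mount_model, model_fixup_aside, model_set_aside,
model_minfree) are hypothetical. MODEL_AGFL_RESERVE is set to 4 based on
the "4 fsbs per AG for the freelist" wording in the set-aside comment,
which is an assumption about XFS_ALLOC_AGFL_RESERVE in this tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value of XFS_ALLOC_AGFL_RESERVE, taken from the set-aside comment. */
#define MODEL_AGFL_RESERVE	4

/* Hypothetical stand-in for the struct xfs_mount fields used by the patch. */
struct mount_model {
	unsigned int agcount;        /* models mp->m_sb.sb_agcount */
	unsigned int ag_maxlevels;   /* models mp->m_ag_maxlevels */
	unsigned int rmap_maxlevels; /* models mp->m_rmap_maxlevels */
	bool has_rmapbt;             /* models xfs_has_rmapbt(mp) */
};

/* Mirrors xfs_ag_fixup_aside(): extra blocks kept back for a second fixup. */
static unsigned int model_fixup_aside(const struct mount_model *mp)
{
	unsigned int ret;

	/* The maxlevels must already be computed, or the reserve is too small. */
	assert(mp->ag_maxlevels > 0);

	ret = 2 * mp->ag_maxlevels;
	if (mp->has_rmapbt)
		ret += mp->rmap_maxlevels;

	return ret;
}

/* Mirrors the patched xfs_alloc_set_aside(). */
static unsigned int model_set_aside(const struct mount_model *mp)
{
	return mp->agcount * (MODEL_AGFL_RESERVE + 4 + model_fixup_aside(mp));
}

/*
 * Threshold for the free space check only. The AGFL fill/shorten target
 * stays at min_freelist, so the extra reserve never resizes the AGFL.
 */
static unsigned int model_minfree(const struct mount_model *mp,
				  unsigned int min_freelist,
				  unsigned int postallocs)
{
	unsigned int minfree = min_freelist;

	if (postallocs)
		minfree += model_fixup_aside(mp);

	return minfree;
}
```

For sample values of 4 AGs, ag_maxlevels = 5, rmap_maxlevels = 9 and
rmapbt enabled, model_fixup_aside() is 19 and model_set_aside() is 108.
With a minimum freelist of 6, the space check uses 25 when postallocs is
set and 6 otherwise, while the AGFL target remains 6 in both cases.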

[1] https://www.spinics.net/lists/linux-xfs/msg66440.html
[2] https://lore.kernel.org/linux-xfs/20221228133204.4021519-1-guoxuenan@huawei.com/



Fixes: 53f85096 ("xfs: account extra freespace btree splits for multiple allocations")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
parent cb560841
+36 −5
@@ -81,6 +81,25 @@ xfs_prealloc_blocks(
 	return XFS_IBT_BLOCK(mp) + 1;
 }
 
+/*
+ * Two freelist fixups for the same AG may happen within a single transaction,
+ * and AGFL blocks consumed after the first fixup may cause the second fixup to
+ * fail and shut down xfs. To avoid that, reserve enough blocks to satisfy the
+ * second fixup.
+ */
+xfs_extlen_t
+xfs_ag_fixup_aside(
+	struct xfs_mount	*mp)
+{
+	xfs_extlen_t ret;
+
+	ret = 2 * mp->m_ag_maxlevels;
+	if (xfs_has_rmapbt(mp))
+		ret += mp->m_rmap_maxlevels;
+
+	return ret;
+}
+
 /*
  * In order to avoid ENOSPC-related deadlock caused by out-of-order locking of
  * AGF buffer (PV 947395), we place constraints on the relationship among
@@ -95,12 +114,15 @@ xfs_prealloc_blocks(
  *
  * We need to reserve 4 fsbs _per AG_ for the freelist and 4 more to handle a
  * potential split of the file's bmap btree.
+ *
+ * See the comment for xfs_ag_fixup_aside for why we reserve more blocks.
  */
 unsigned int
 xfs_alloc_set_aside(
 	struct xfs_mount	*mp)
 {
-	return mp->m_sb.sb_agcount * (XFS_ALLOC_AGFL_RESERVE + 4);
+	return mp->m_sb.sb_agcount * (XFS_ALLOC_AGFL_RESERVE +
+			4 + xfs_ag_fixup_aside(mp));
 }
 
 /*
@@ -133,6 +155,8 @@ xfs_alloc_ag_max_usable(
 	if (xfs_has_reflink(mp))
 		blocks++;		/* refcount root block */
 
+	blocks += xfs_ag_fixup_aside(mp);
+
 	return mp->m_sb.sb_agblocks - blocks;
 }
 
@@ -2591,6 +2615,7 @@ xfs_alloc_fix_freelist(
 	struct xfs_alloc_arg	targs;	/* local allocation arguments */
 	xfs_agblock_t		bno;	/* freelist block */
 	xfs_extlen_t		need;	/* total blocks needed in freelist */
+	xfs_extlen_t		minfree;
 	int			error = 0;
 
 	/* deferred ops (AGFL block frees) require permanent transactions */
@@ -2622,8 +2647,11 @@ xfs_alloc_fix_freelist(
 	 * blocks to perform multiple allocations from a single AG and
 	 * transaction if needed.
 	 */
-	need = xfs_alloc_min_freelist(mp, pag) * (1 + args->postallocs);
-	if (!xfs_alloc_space_available(args, need, alloc_flags |
+	minfree = need = xfs_alloc_min_freelist(mp, pag);
+	if (args->postallocs)
+		minfree += xfs_ag_fixup_aside(mp);
+
+	if (!xfs_alloc_space_available(args, minfree, alloc_flags |
 			XFS_ALLOC_FLAG_CHECK))
 		goto out_agbp_relse;
 
@@ -2646,8 +2674,11 @@ xfs_alloc_fix_freelist(
 		xfs_agfl_reset(tp, agbp, pag);
 
 	/* If there isn't enough total space or single-extent, reject it. */
-	need = xfs_alloc_min_freelist(mp, pag) * (1 + args->postallocs);
-	if (!xfs_alloc_space_available(args, need, alloc_flags))
+	minfree = need = xfs_alloc_min_freelist(mp, pag);
+	if (args->postallocs)
+		minfree += xfs_ag_fixup_aside(mp);
+
+	if (!xfs_alloc_space_available(args, minfree, alloc_flags))
 		goto out_agbp_relse;
 
 	/*
+9 −0
@@ -778,6 +778,15 @@ xfs_mountfs(
 	xfs_rmapbt_compute_maxlevels(mp);
 	xfs_refcountbt_compute_maxlevels(mp);
 
+	/*
+	 * m_alloc_set_aside and m_ag_max_usable now depend on m_ag_maxlevels and
+	 * m_rmap_maxlevels. Those were still 0 when xfs_sb_mount_common did the
+	 * first initialization, so recompute both values now that the maxlevels
+	 * are known.
+	 */
+	mp->m_alloc_set_aside = xfs_alloc_set_aside(mp);
+	mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp);
+
 	/*
 	 * Check if sb_agblocks is aligned at stripe boundary.  If sb_agblocks
 	 * is NOT aligned turn off m_dalign since allocator alignment is within