Commit c88fc882, authored Nov 20, 2024 by Chi Zhiling; committed Nov 20, 2024 by Jiangshan Yi

ocfs2: fix unexpected zeroing of virtual disk

mainline inclusion
from mainline-v6.12-rc1
commit 03222db82a3a0db43cbad00886c800819fdc59f3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IB5LRI

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=03222db82a3a0db43cbad00886c800819fdc59f3

--------------------------------

In a guest virtual machine, we occasionally observed unexpected zeroing of
data on a virtual disk:

XFS (vdb): Mounting V5 Filesystem
XFS (vdb): Ending clean mount
XFS (vdb): Metadata CRC error detected at xfs_refcountbt_read_verify+0x2c/0xf0, xfs_refcountbt block 0x200028
XFS (vdb): Unmount and run xfs_repair
XFS (vdb): First 128 bytes of corrupted metadata buffer:
00000000e0cd2f5e: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000000cafd57f5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000000d0298d7d: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000000f0698484: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000000adb789a7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000005292b878: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000000885b4700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000000fd4b4df7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
XFS (vdb): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x200028 len 8 error 74
XFS (vdb): Error -117 recovering leftover CoW allocations.
XFS (vdb): xfs_do_force_shutdown(0x8) called from line 994 of file fs/xfs/xfs_mount.c.  Return address = 000000003a53523a
XFS (vdb): Corruption of in-memory data detected.  Shutting down filesystem
XFS (vdb): Please umount the filesystem and rectify the problem(s)

It turns out that the root cause is on the physical host machine.  More
specifically, it is caused by ocfs2.

When the page size is 64k and the block size is 4k, the physical block
number should advance by 16 for each skipped page instead of 1.  Advancing
by 1 leads to a wrong mapping from the following pages to the disk, which
zeroes some adjacent part of the disk.
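
The arithmetic behind the figure of 16 can be sketched in a small
userspace example.  This is only an illustration, not the ocfs2 code:
blocks_per_page() and skip_pages() are hypothetical helpers, and the page
and block sizes are passed in as log2 values rather than read from
PAGE_SHIFT or the superblock.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of ocfs2): number of filesystem blocks
 * covered by one page, given log2 of the page size and of the block size.
 * Assumes the page is at least as large as a block.
 */
static inline uint64_t blocks_per_page(unsigned int page_shift,
                                       unsigned int blocksize_bits)
{
    assert(page_shift >= blocksize_bits);
    return (uint64_t)1 << (page_shift - blocksize_bits);
}

/*
 * Hypothetical helper: physical block number after skipping `pages`
 * whole pages, starting from p_blkno.
 */
static inline uint64_t skip_pages(uint64_t p_blkno, unsigned int pages,
                                  unsigned int page_shift,
                                  unsigned int blocksize_bits)
{
    return p_blkno + (uint64_t)pages * blocks_per_page(page_shift,
                                                       blocksize_bits);
}
```

With 64k pages (shift 16) and 4k blocks (shift 12), blocks_per_page()
yields 16, so skipping one page moves p_blkno by 16.  With 4k pages and 4k
blocks the shift difference is 0 and the result is 1, which would explain
why a plain increment behaves correctly in that configuration.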

Link: https://lkml.kernel.org/r/20240815092141.1223238-1-chizhiling@163.com


Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Suggested-by: Shida Zhang <zhangshida@kylinos.cn>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn>

parent 76e29fd0

fs/ocfs2/aops.c

+1 −1

@@ -1188,7 +1188,7 @@ static int ocfs2_write_cluster(struct address_space *mapping,
 
 		/* This is the direct io target page. */
 		if (wc->w_pages[i] == NULL) {
-			p_blkno++;
+			p_blkno += (1 << (PAGE_SHIFT - inode->i_sb->s_blocksize_bits));
 			continue;
 		}