Commit 7aca5ca1 authored by David Hildenbrand, committed by Andrew Morton

selftests/vm: anon_cow: prepare for non-anonymous COW tests

Patch series "mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O
long-term pinning)".

So far, reliable R/O long-term pinning in COW mappings has not been
supported.  That means that if we trigger R/O long-term pinning in a
MAP_PRIVATE mapping, we can end up pinning the (R/O-mapped) shared
zeropage or a pagecache page.

The next write access would trigger a write fault and replace the pinned
page with an exclusive anonymous page in the process page table; whatever
the process writes to that private page copy would not be visible to
the owner of the previous page pin: for example, RDMA could read stale
data.  The end result is essentially an unexpected and hard-to-debug
memory corruption.
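
The userspace half of this behavior can be observed without any driver. The
following is a minimal sketch, not part of the patch: it maps a temporary
file MAP_PRIVATE, reads it (which typically maps the pagecache page R/O),
writes through the mapping (breaking COW), and then checks that the file
still holds the old data. The function name and the temporary file path are
made up for illustration; a real long-term pin would sit where the file
read-back is.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch).
 * Returns 1 if a write through a MAP_PRIVATE file mapping left the file
 * contents untouched, 0 if the file changed, -1 on a setup error.
 */
int private_write_leaves_file_untouched(void)
{
	char path[] = "/tmp/cow_sketch_XXXXXX";
	char from_file[4] = "";
	long pagesize = sysconf(_SC_PAGESIZE);
	int fd, ret = -1;
	char *mem;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	if (ftruncate(fd, pagesize) || pwrite(fd, "old", 3, 0) != 3)
		goto out;

	mem = mmap(NULL, pagesize, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (mem == MAP_FAILED)
		goto out;

	/* Read access: the mapping is populated from the file's page. */
	if (memcmp(mem, "old", 3))
		goto unmap;

	/* Write access: write fault, COW is broken, we get a private copy. */
	memcpy(mem, "new", 3);

	/* Whoever still looks at the file's page keeps seeing the old data. */
	if (pread(fd, from_file, 3, 0) != 3)
		goto unmap;
	ret = !memcmp(from_file, "old", 3) && !memcmp(mem, "new", 3);
unmap:
	munmap(mem, pagesize);
out:
	close(fd);
	return ret;
}
```

A pin taken on the R/O-mapped page before the write would be in the same
position as the pread() above: it keeps referring to the old page, while the
process continues with its private copy.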

Some drivers tried working around that limitation by using
"FOLL_FORCE|FOLL_WRITE|FOLL_LONGTERM" for R/O long-term pinning.
FOLL_WRITE would trigger a write fault, if required, and break COW before
pinning the page.  FOLL_FORCE is required because the VMA might lack write
permissions, and drivers wanted to make that case work as well, just like
one would expect (no write access, but still triggering a write fault to
break COW).

However, that is not a practical solution, because:
(1) Drivers that don't stick to that undocumented and debatable pattern
    would still run into that issue. For example, VFIO only uses
    FOLL_LONGTERM for R/O long-term pinning.
(2) Using FOLL_WRITE just to work around a COW mapping + page pinning
    limitation is unintuitive. FOLL_WRITE would, for example, mark the
    page softdirty or trigger uffd-wp, even though there actually isn't
    going to be any write access.
(3) The purpose of FOLL_FORCE is debug access, not access without
    VMA permissions by arbitrary drivers.

So instead, make R/O long-term pinning work as expected, by breaking COW
in a COW mapping early, such that we can remove any FOLL_FORCE usage from
drivers and make FOLL_FORCE ptrace-specific (renaming it to FOLL_PTRACE).
More details in patch #8.


This patch (of 19):

Originally, the plan was to have separate tests for testing COW of
non-anonymous (e.g., shared zeropage) pages.

It turns out that we'd need a lot of similar functionality and that there
isn't a really good reason to separate it. So let's prepare for non-anon
tests by renaming to "cow".
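
One way to picture the layout this prepares for is one test case table per
category, a shared runner, and a plan that sums the table sizes. The sketch
below is illustrative only and does not use kselftest.h: the test_fn
signature is simplified, and the non_anon_test_cases table, the test
descriptions, and the helper names are hypothetical (the patch itself only
introduces anon_test_cases).

```c
#include <stddef.h>
#include <stdio.h>

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Simplified for illustration: returns 0 on success. */
typedef int (*test_fn)(void);

struct test_case {
	const char *desc;
	test_fn fn;
};

/* Placeholder test body. */
static int always_passes(void)
{
	return 0;
}

static const struct test_case anon_test_cases[] = {
	{ "hypothetical anon test 1", always_passes },
	{ "hypothetical anon test 2", always_passes },
};

/* Hypothetical second category, added by a later change. */
static const struct test_case non_anon_test_cases[] = {
	{ "hypothetical non-anon test", always_passes },
};

/* Run one category and return the number of failed test cases. */
int run_test_cases(const char *category, const struct test_case *cases,
		   size_t nr)
{
	int failed = 0;
	size_t i;

	printf("[INFO] %s\n", category);
	for (i = 0; i < nr; i++) {
		int ok = cases[i].fn() == 0;

		printf("%s %s\n", ok ? "ok" : "not ok", cases[i].desc);
		failed += !ok;
	}
	return failed;
}

/* The plan is the sum over all categories. */
size_t planned_tests(void)
{
	return ARRAY_SIZE(anon_test_cases) + ARRAY_SIZE(non_anon_test_cases);
}

int run_all(void)
{
	return run_test_cases("Anonymous memory tests", anon_test_cases,
			      ARRAY_SIZE(anon_test_cases)) +
	       run_test_cases("Non-anonymous memory tests", non_anon_test_cases,
			      ARRAY_SIZE(non_anon_test_cases));
}
```

Keeping everything in one binary lets both categories share the setup code
and helpers, which is the reason given above for not splitting the tests.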

Link: https://lkml.kernel.org/r/20221116102659.70287-1-david@redhat.com
Link: https://lkml.kernel.org/r/20221116102659.70287-2-david@redhat.com


Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Walls <awalls@md.metrocast.net>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Benvenuti <benve@cisco.com>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Airlie <airlied@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Inki Dae <inki.dae@samsung.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nelson Escobar <neescoba@cisco.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Seung-Woo Kim <sw0312.kim@samsung.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent 74947724
+1 −1
# SPDX-License-Identifier: GPL-2.0-only
-anon_cow
+cow
hugepage-mmap
hugepage-mremap
hugepage-shm
+5 −5
@@ -27,7 +27,7 @@ MAKEFLAGS += --no-builtin-rules

CFLAGS = -Wall -I $(top_srcdir) -I $(top_srcdir)/usr/include $(EXTRA_CFLAGS) $(KHDR_INCLUDES)
LDLIBS = -lrt -lpthread
-TEST_GEN_FILES = anon_cow
+TEST_GEN_FILES = cow
TEST_GEN_FILES += compaction_test
TEST_GEN_FILES += gup_test
TEST_GEN_FILES += hmm-tests
@@ -98,7 +98,7 @@ TEST_FILES += va_128TBswitch.sh

include ../lib.mk

-$(OUTPUT)/anon_cow: vm_util.c
+$(OUTPUT)/cow: vm_util.c
$(OUTPUT)/khugepaged: vm_util.c
$(OUTPUT)/madv_populate: vm_util.c
$(OUTPUT)/soft-dirty: vm_util.c
@@ -154,8 +154,8 @@ warn_32bit_failure:
endif
endif

-# ANON_COW_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
-$(OUTPUT)/anon_cow: LDLIBS += $(ANON_COW_EXTRA_LIBS)
+# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
+$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)

$(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap

@@ -168,7 +168,7 @@ local_config.mk local_config.h: check_config.sh

EXTRA_CLEAN += local_config.mk local_config.h

-ifeq ($(ANON_COW_EXTRA_LIBS),)
+ifeq ($(COW_EXTRA_LIBS),)
all: warn_missing_liburing

warn_missing_liburing:
+2 −2
@@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1

if [ -f $tmpfile_o ]; then
    echo "#define LOCAL_CONFIG_HAVE_LIBURING 1"  > $OUTPUT_H_FILE
-    echo "ANON_COW_EXTRA_LIBS = -luring"         > $OUTPUT_MKFILE
+    echo "COW_EXTRA_LIBS = -luring"              > $OUTPUT_MKFILE
else
    echo "// No liburing support found"          > $OUTPUT_H_FILE
    echo "# No liburing support found, so:"      > $OUTPUT_MKFILE
-    echo "ANON_COW_EXTRA_LIBS = "               >> $OUTPUT_MKFILE
+    echo "COW_EXTRA_LIBS = "                    >> $OUTPUT_MKFILE
fi

rm ${tmpname}.*
+15 −10
// SPDX-License-Identifier: GPL-2.0-only
/*
- * COW (Copy On Write) tests for anonymous memory.
+ * COW (Copy On Write) tests.
 *
 * Copyright 2022, Red Hat, Inc.
 *
@@ -986,7 +986,11 @@ struct test_case {
	test_fn fn;
};

-static const struct test_case test_cases[] = {
+/*
+ * Test cases that are specific to anonymous pages: pages in private mappings
+ * that may get shared via COW during fork().
+ */
+static const struct test_case anon_test_cases[] = {
	/*
	 * Basic COW tests for fork() without any GUP. If we miss to break COW,
	 * either the child can observe modifications by the parent or the
@@ -1104,7 +1108,7 @@ static const struct test_case test_cases[] = {
	},
};

-static void run_test_case(struct test_case const *test_case)
+static void run_anon_test_case(struct test_case const *test_case)
{
	int i;

@@ -1125,15 +1129,17 @@ static void run_test_case(struct test_case const *test_case)
				 hugetlbsizes[i]);
}

-static void run_test_cases(void)
+static void run_anon_test_cases(void)
{
	int i;

-	for (i = 0; i < ARRAY_SIZE(test_cases); i++)
-		run_test_case(&test_cases[i]);
+	ksft_print_msg("[INFO] Anonymous memory tests in private mappings\n");
+
+	for (i = 0; i < ARRAY_SIZE(anon_test_cases); i++)
+		run_anon_test_case(&anon_test_cases[i]);
}

-static int tests_per_test_case(void)
+static int tests_per_anon_test_case(void)
{
	int tests = 2 + nr_hugetlbsizes;

@@ -1144,7 +1150,6 @@ static int tests_per_test_case(void)

int main(int argc, char **argv)
{
-	int nr_test_cases = ARRAY_SIZE(test_cases);
	int err;

	pagesize = getpagesize();
@@ -1152,14 +1157,14 @@ int main(int argc, char **argv)
	detect_hugetlbsizes();

	ksft_print_header();
-	ksft_set_plan(nr_test_cases * tests_per_test_case());
+	ksft_set_plan(ARRAY_SIZE(anon_test_cases) * tests_per_anon_test_case());

	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
	if (pagemap_fd < 0)
		ksft_exit_fail_msg("opening pagemap failed\n");

-	run_test_cases();
+	run_anon_test_cases();

	err = ksft_get_fail_cnt();
	if (err)
+1 −1
@@ -186,6 +186,6 @@ fi
run_test ./soft-dirty

# COW tests for anonymous memory
-run_test ./anon_cow
+run_test ./cow

exit $exitcode