Commit f2b79c0d, authored by Aneesh Kumar K.V, committed by Andrew Morton

powerpc/book3s64/radix: add support for vmemmap optimization for radix

With a 64K page size and 2M PMD-level mapping, we require 32 struct pages,
while a single vmemmap page can hold 1024 struct pages
(PAGE_SIZE/sizeof(struct page)).  The struct pages for one PMD mapping
therefore occupy only a fraction of a vmemmap page, so with a 64K page size
we don't use vmemmap deduplication for PMD-level mapping.

[aneesh.kumar@linux.ibm.com: ppc64: don't include radix headers if CONFIG_PPC_RADIX_MMU=n]
  Link: https://lkml.kernel.org/r/87zg3jw8km.fsf@linux.ibm.com
Link: https://lkml.kernel.org/r/20230724190759.483013-12-aneesh.kumar@linux.ibm.com


Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent 368a0590
+1 −0
@@ -210,6 +210,7 @@ the device (altmap).

The following page sizes are supported in DAX: PAGE_SIZE (4K on x86_64),
PMD_SIZE (2M on x86_64) and PUD_SIZE (1G on x86_64).
For powerpc equivalent details see Documentation/powerpc/vmemmap_dedup.rst

The differences with HugeTLB are relatively minor.

+1 −0
@@ -36,6 +36,7 @@ powerpc
    ultravisor
    vas-api
    vcpudispatch_stats
    vmemmap_dedup

    features

+101 −0
.. SPDX-License-Identifier: GPL-2.0

==========
Device DAX
==========

The device-dax interface uses the tail deduplication technique explained in
Documentation/mm/vmemmap_dedup.rst

On powerpc, vmemmap deduplication is only used with radix MMU translation. In
addition, with a 64K page size, only devdax namespaces with 1G alignment use
vmemmap deduplication.

With 2M PMD level mapping, we require 32 struct pages and a single 64K vmemmap
page can contain 1024 struct pages (64K/sizeof(struct page)). The struct pages
for one PMD mapping fill only part of a vmemmap page, hence no vmemmap
deduplication is possible.

With 1G PUD level mapping, we require 16384 struct pages and a single 64K
vmemmap page can contain 1024 struct pages (64K/sizeof(struct page)). Hence we
require 16 64K pages in vmemmap to map the struct pages for 1G PUD level
mapping.

Here's how things look on device-dax after the sections are populated::

 +-----------+ ---virt_to_page---> +-----------+   mapping to   +-----------+
 |           |                     |     0     | -------------> |     0     |
 |           |                     +-----------+                +-----------+
 |           |                     |     1     | -------------> |     1     |
 |           |                     +-----------+                +-----------+
 |           |                     |     2     | ----------------^ ^ ^ ^ ^ ^
 |           |                     +-----------+                   | | | | |
 |           |                     |     3     | ------------------+ | | | |
 |           |                     +-----------+                     | | | |
 |           |                     |     4     | --------------------+ | | |
 |    PUD    |                     +-----------+                       | | |
 |   level   |                     |     .     | ----------------------+ | |
 |  mapping  |                     +-----------+                         | |
 |           |                     |     .     | ------------------------+ |
 |           |                     +-----------+                           |
 |           |                     |     15    | --------------------------+
 |           |                     +-----------+
 |           |
 |           |
 |           |
 +-----------+


With 4K page size, 2M PMD level mapping requires 512 struct pages and a single
4K vmemmap page contains 64 struct pages (4K/sizeof(struct page)). Hence we
require 8 4K pages in vmemmap to map the struct pages for 2M PMD level mapping.

Here's how things look on device-dax after the sections are populated::

 +-----------+ ---virt_to_page---> +-----------+   mapping to   +-----------+
 |           |                     |     0     | -------------> |     0     |
 |           |                     +-----------+                +-----------+
 |           |                     |     1     | -------------> |     1     |
 |           |                     +-----------+                +-----------+
 |           |                     |     2     | ----------------^ ^ ^ ^ ^ ^
 |           |                     +-----------+                   | | | | |
 |           |                     |     3     | ------------------+ | | | |
 |           |                     +-----------+                     | | | |
 |           |                     |     4     | --------------------+ | | |
 |    PMD    |                     +-----------+                       | | |
 |   level   |                     |     5     | ----------------------+ | |
 |  mapping  |                     +-----------+                         | |
 |           |                     |     6     | ------------------------+ |
 |           |                     +-----------+                           |
 |           |                     |     7     | --------------------------+
 |           |                     +-----------+
 |           |
 |           |
 |           |
 +-----------+

With 1G PUD level mapping, we require 262144 struct pages and a single 4K
vmemmap page can contain 64 struct pages (4K/sizeof(struct page)). Hence we
require 4096 4K pages in vmemmap to map the struct pages for 1G PUD level
mapping.

Here's how things look on device-dax after the sections are populated::

 +-----------+ ---virt_to_page---> +-----------+   mapping to   +-----------+
 |           |                     |     0     | -------------> |     0     |
 |           |                     +-----------+                +-----------+
 |           |                     |     1     | -------------> |     1     |
 |           |                     +-----------+                +-----------+
 |           |                     |     2     | ----------------^ ^ ^ ^ ^ ^
 |           |                     +-----------+                   | | | | |
 |           |                     |     3     | ------------------+ | | | |
 |           |                     +-----------+                     | | | |
 |           |                     |     4     | --------------------+ | | |
 |    PUD    |                     +-----------+                       | | |
 |   level   |                     |     .     | ----------------------+ | |
 |  mapping  |                     +-----------+                         | |
 |           |                     |     .     | ------------------------+ |
 |           |                     +-----------+                           |
 |           |                     |   4095    | --------------------------+
 |           |                     +-----------+
 |           |
 |           |
 |           |
 +-----------+
+1 −0
@@ -174,6 +174,7 @@ config PPC
	select ARCH_WANT_IPC_PARSE_VERSION
	select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
	select ARCH_WANT_LD_ORPHAN_WARN
	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP	if PPC_RADIX_MMU
	select ARCH_WANTS_MODULES_DATA_IN_VMALLOC	if PPC_BOOK3S_32 || PPC_8xx
	select ARCH_WEAK_RELEASE_ACQUIRE
	select BINFMT_ELF
+11 −0
@@ -326,6 +326,7 @@ static inline pud_t radix__pud_mkdevmap(pud_t pud)
}

struct vmem_altmap;
struct dev_pagemap;
extern int __meminit radix__vmemmap_create_mapping(unsigned long start,
					     unsigned long page_size,
					     unsigned long phys);
@@ -363,5 +364,15 @@ int radix__remove_section_mapping(unsigned long start, unsigned long end);

void radix__kernel_map_pages(struct page *page, int numpages, int enable);

#ifdef CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP
#define vmemmap_can_optimize vmemmap_can_optimize
bool vmemmap_can_optimize(struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
#endif

#define vmemmap_populate_compound_pages vmemmap_populate_compound_pages
int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
					      unsigned long start,
					      unsigned long end, int node,
					      struct dev_pagemap *pgmap);
#endif /* __ASSEMBLY__ */
#endif