Commit 9030fb0b authored by Linus Torvalds
Pull folio updates from Matthew Wilcox:

 - Rewrite how munlock works to massively reduce the contention on
   i_mmap_rwsem (Hugh Dickins):

     https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/

 - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph
   Hellwig):

     https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/

 - Convert GUP to use folios and make pincount available for order-1
   pages. (Matthew Wilcox)

 - Convert a few more truncation functions to use folios (Matthew
   Wilcox)

 - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew
   Wilcox)

 - Convert rmap_walk to use folios (Matthew Wilcox)

 - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)

 - Add support for creating large folios in readahead (Matthew Wilcox)

* tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits)
  mm/damon: minor cleanup for damon_pa_young
  selftests/vm/transhuge-stress: Support file-backed PMD folios
  mm/filemap: Support VM_HUGEPAGE for file mappings
  mm/readahead: Switch to page_cache_ra_order
  mm/readahead: Align file mappings for non-DAX
  mm/readahead: Add large folio readahead
  mm: Support arbitrary THP sizes
  mm: Make large folios depend on THP
  mm: Fix READ_ONLY_THP warning
  mm/filemap: Allow large folios to be added to the page cache
  mm: Turn can_split_huge_page() into can_split_folio()
  mm/vmscan: Convert pageout() to take a folio
  mm/vmscan: Turn page_check_references() into folio_check_references()
  mm/vmscan: Account large folios correctly
  mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
  mm/vmscan: Free non-shmem folios without splitting them
  mm/rmap: Constify the rmap_walk_control argument
  mm/rmap: Convert rmap_walk() to take a folio
  mm: Turn page_anon_vma() into folio_anon_vma()
  mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
  ...
parents 3bf03b9a 2a3c4bce
+9 −9
@@ -55,18 +55,18 @@ flags the caller provides. The caller is required to pass in a non-null struct
pages* array, and the function then pins pages by incrementing each by a special
value: GUP_PIN_COUNTING_BIAS.

-For huge pages (and in fact, any compound page of more than 2 pages), the
-GUP_PIN_COUNTING_BIAS scheme is not used. Instead, an exact form of pin counting
-is achieved, by using the 3rd struct page in the compound page. A new struct
-page field, hpage_pinned_refcount, has been added in order to support this.
+For compound pages, the GUP_PIN_COUNTING_BIAS scheme is not used. Instead,
+an exact form of pin counting is achieved, by using the 2nd struct page
+in the compound page. A new struct page field, compound_pincount, has
+been added in order to support this.

This approach for compound pages avoids the counting upper limit problems that
are discussed below. Those limitations would have been aggravated severely by
huge pages, because each tail page adds a refcount to the head page. And in
-fact, testing revealed that, without a separate hpage_pinned_refcount field,
+fact, testing revealed that, without a separate compound_pincount field,
page overflows were seen in some huge page stress tests.

-This also means that huge pages and compound pages (of order > 1) do not suffer
+This also means that huge pages and compound pages do not suffer
from the false positives problem that is mentioned below.::

 Function
@@ -264,9 +264,9 @@ place.)
Other diagnostics
=================

-dump_page() has been enhanced slightly, to handle these new counting fields, and
-to better report on compound pages in general. Specifically, for compound pages
-with order > 1, the exact (hpage_pinned_refcount) pincount is reported.
+dump_page() has been enhanced slightly, to handle these new counting
+fields, and to better report on compound pages in general. Specifically,
+for compound pages, the exact (compound_pincount) pincount is reported.

References
==========
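
The pin-counting text changed in the documentation hunk above contrasts two schemes: a bias added to the ordinary refcount for single pages, and a separate exact pincount for compound pages. A minimal userspace sketch of that contrast follows. It is a simplified model, not kernel code: struct model_page and the model_* helpers are hypothetical names invented for illustration, the bias is assumed here to be 1U << 10, and locking, atomics and overflow handling are all left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bias value; everything named model_* is hypothetical. */
#define GUP_PIN_COUNTING_BIAS (1U << 10)

struct model_page {
	unsigned int refcount;	/* ordinary references plus any pin bias */
	unsigned int pincount;	/* exact pin count, used only when compound */
	bool compound;		/* true for a compound (multi-page) page */
};

/* Take one pin on the page. */
static void model_pin(struct model_page *p)
{
	if (p->compound) {
		/* Exact scheme: one ordinary reference plus one counted pin. */
		p->refcount += 1;
		p->pincount += 1;
	} else {
		/* Bias scheme: the pin is folded into the refcount itself. */
		p->refcount += GUP_PIN_COUNTING_BIAS;
	}
}

/* Drop one pin previously taken with model_pin(). */
static void model_unpin(struct model_page *p)
{
	if (p->compound) {
		assert(p->pincount > 0);
		p->pincount -= 1;
		p->refcount -= 1;
	} else {
		assert(p->refcount >= GUP_PIN_COUNTING_BIAS);
		p->refcount -= GUP_PIN_COUNTING_BIAS;
	}
}

/* Report whether the page looks pinned under its counting scheme. */
static bool model_maybe_pinned(const struct model_page *p)
{
	if (p->compound)
		return p->pincount > 0;	/* exact answer */
	/* Heuristic: many ordinary references are indistinguishable from a pin. */
	return p->refcount >= GUP_PIN_COUNTING_BIAS;
}
```

In this model, a single page that merely holds GUP_PIN_COUNTING_BIAS ordinary references is reported as pinned even though model_pin() was never called on it, which is the false positives problem the documentation refers to. A compound page with any number of ordinary references is reported as pinned only while its pincount is nonzero.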
+1 −0
@@ -233,6 +233,7 @@ pmd_page_vaddr(pmd_t pmd)
	return ((pmd_val(pmd) & _PFN_MASK) >> (32-PAGE_SHIFT)) + PAGE_OFFSET;
}

+#define pmd_pfn(pmd)	(pmd_val(pmd) >> 32)
#define pmd_page(pmd)	(pfn_to_page(pmd_val(pmd) >> 32))
#define pud_page(pud)	(pfn_to_page(pud_val(pud) >> 32))

+0 −1
@@ -31,7 +31,6 @@ static inline pmd_t pte_pmd(pte_t pte)

#define pmd_write(pmd)		pte_write(pmd_pte(pmd))
#define pmd_young(pmd)		pte_young(pmd_pte(pmd))
-#define pmd_pfn(pmd)		pte_pfn(pmd_pte(pmd))
#define pmd_dirty(pmd)		pte_dirty(pmd_pte(pmd))

#define mk_pmd(page, prot)	pte_pmd(mk_pte(page, prot))
+1 −0
@@ -161,6 +161,7 @@
#define pmd_present(x)		(pmd_val(x))
#define pmd_clear(xp)		do { pmd_val(*(xp)) = 0; } while (0)
#define pmd_page_vaddr(pmd)	(pmd_val(pmd) & PAGE_MASK)
+#define pmd_pfn(pmd)		((pmd_val(pmd) & PAGE_MASK) >> PAGE_SHIFT)
#define pmd_page(pmd)		virt_to_page(pmd_page_vaddr(pmd))
#define set_pmd(pmdp, pmd)	(*(pmdp) = pmd)
#define pmd_pgtable(pmd)	((pgtable_t) pmd_page_vaddr(pmd))
+2 −0
@@ -208,6 +208,8 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
}
#define pmd_offset pmd_offset

+#define pmd_pfn(pmd)		(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
+
#define pmd_large(pmd)		(pmd_val(pmd) & 2)
#define pmd_leaf(pmd)		(pmd_val(pmd) & 2)
#define pmd_bad(pmd)		(pmd_val(pmd) & 2)