Commit 76e31254 authored by Chen Jun, committed by Kaixiong Yu

mm/slub: Reduce memory consumption in extreme scenarios

mainline inclusion
from mainline-v6.10-rc1
commit 9198ffbd2b494daae3a67cac1d59c3a2754e64cd
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBH72Q

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9198ffbd2b494daae3a67cac1d59c3a2754e64cd



--------------------------------

When kmalloc_node() is called without __GFP_THISNODE and the target node
lacks sufficient memory, SLUB allocates a new folio from a node other
than the requested one, instead of taking a partial slab from that node.

However, since the allocated folio does not belong to the requested
node, on the following allocation it is deactivated and added to the
partial slab list of the node it belongs to.

This behavior can result in excessive memory usage when the requested
node has insufficient memory, as SLUB will repeatedly allocate folios
from other nodes without reusing the previously allocated ones.

To prevent memory wastage, when a preferred node is indicated (not
NUMA_NO_NODE) but without a prior __GFP_THISNODE constraint:

1) try to get a partial slab from target node only by having
   __GFP_THISNODE in pc.flags for get_partial()
2) if 1) failed, try to allocate a new slab from target node with
   GFP_NOWAIT | __GFP_THISNODE opportunistically.
3) if 2) failed, retry with original gfpflags which will allow
   get_partial() to try partial lists of other nodes before potentially
   allocating a new page from other nodes

Without a preferred node, or with __GFP_THISNODE constraint, the
behavior remains unchanged.
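
The fallback order described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel code: struct toy_node, take(), toy_new_slab_objects(), toy_alloc(),
NO_NODE and LOCAL_NODE are hypothetical names, each node is reduced to two
counters, and the reclaim behavior of GFP_NOWAIT is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES   4
#define NO_NODE    (-1)  /* stand-in for NUMA_NO_NODE */
#define LOCAL_NODE 0     /* assumption: the caller runs on node 0 */

/* Toy per-node state, not the kernel's struct kmem_cache_node. */
struct toy_node {
	int partial_slabs; /* slabs with free objects on this node */
	int free_pages;    /* room left for brand-new slabs on this node */
};

enum source { FROM_NONE, FROM_PARTIAL, FROM_NEW_SLAB };
struct result { enum source src; int node; };

static struct result mk(enum source src, int node)
{
	struct result r = { src, node };
	return r;
}

/* Consume one unit of a counter if any is left. */
static bool take(int *count)
{
	if (*count <= 0)
		return false;
	(*count)--;
	return true;
}

/* One pass: partial lists first, then a new slab. */
static struct result toy_new_slab_objects(struct toy_node *nodes, int node,
					  bool thisnode)
{
	int search = (node == NO_NODE) ? LOCAL_NODE : node;
	bool pinned = (node != NO_NODE) && thisnode;
	int i;

	if (take(&nodes[search].partial_slabs))
		return mk(FROM_PARTIAL, search);
	/* Other nodes' partial lists are only tried when not pinned. */
	for (i = 0; !pinned && i < NR_NODES; i++)
		if (i != search && take(&nodes[i].partial_slabs))
			return mk(FROM_PARTIAL, i);

	if (take(&nodes[search].free_pages))
		return mk(FROM_NEW_SLAB, search);
	for (i = 0; !thisnode && i < NR_NODES; i++)
		if (i != search && take(&nodes[i].free_pages))
			return mk(FROM_NEW_SLAB, i);

	return mk(FROM_NONE, NO_NODE);
}

/* Steps 1) to 3): strict pass on the preferred node, then one retry. */
static struct result toy_alloc(struct toy_node *nodes, int node, bool thisnode)
{
	bool try_thisnode = true;

	assert(node == NO_NODE || (node >= 0 && node < NR_NODES));
	for (;;) {
		bool opportunistic = node != NO_NODE && !thisnode && try_thisnode;
		struct result r = toy_new_slab_objects(nodes, node,
						       thisnode || opportunistic);

		if (r.src != FROM_NONE || !opportunistic)
			return r;
		try_thisnode = false; /* fall back to the caller's constraints */
	}
}
```

In this model, a request for an exhausted node 3 without the strict flag
first reuses a partial slab from another node and only then consumes a new
page elsewhere, which is the reuse the patch aims for.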

Tested on QEMU with 4 NUMA nodes of 1G memory each, using a test kernel
module that calls kmalloc_node(196, GFP_KERNEL, 3) (4 * 1024 + 4) * 1024
times.

Before this patch,
cat /proc/slabinfo shows:
kmalloc-256       4200530 13519712    256   32    2 : tunables..

after this patch,
cat /proc/slabinfo shows:
kmalloc-256       4200558 4200768    256   32    2 : tunables..
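
To put the two slabinfo lines in perspective, the columns can be turned
into slab counts and memory footprint. The helper below is a sketch, not
part of the patch: struct slab_line and its functions are hypothetical
names, the fields follow the slabinfo column order (active_objs, num_objs,
objsize, objperslab, pagesperslab), and a 4096-byte page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* The numeric columns of one /proc/slabinfo line. */
struct slab_line {
	uint64_t active_objs;
	uint64_t num_objs;
	uint64_t objsize;
	uint64_t objperslab;
	uint64_t pagesperslab;
};

/* Number of slabs needed to hold num_objs objects. */
static uint64_t slab_count(const struct slab_line *l)
{
	assert(l->objperslab > 0);
	return (l->num_objs + l->objperslab - 1) / l->objperslab;
}

/* Memory backing the cache, in bytes, for a given page size. */
static uint64_t slab_bytes(const struct slab_line *l, uint64_t page_size)
{
	return slab_count(l) * l->pagesperslab * page_size;
}

/* Share of allocated object slots actually in use, rounded down. */
static unsigned int utilization_pct(const struct slab_line *l)
{
	assert(l->num_objs > 0);
	return (unsigned int)(l->active_objs * 100 / l->num_objs);
}

/* The kmalloc-256 lines quoted above. */
static const struct slab_line before_patch = { 4200530, 13519712, 256, 32, 2 };
static const struct slab_line after_patch  = { 4200558,  4200768, 256, 32, 2 };
```

With these inputs the cache shrinks from 422491 slabs (3461046272 bytes,
about 3.2 GiB, 31% of slots in use) to 131274 slabs (1075396608 bytes,
about 1.0 GiB, 99% of slots in use).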

Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Conflicts:
	mm/slub.c
[The conflict is large because OLK-5.10 has not merged mainline patch
53a0de06 ("mm, slub: dissolve new_slab_objects() into ___slab_alloc()")]
Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com>
parent 25d1b5de
+28 −2
@@ -2105,7 +2105,7 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
		searchnode = numa_mem_id();

	object = get_partial_node(s, get_node(s, searchnode), c, flags);
-	if (object || node != NUMA_NO_NODE)
+	if (object || (node != NUMA_NO_NODE && (flags & __GFP_THISNODE)))
		return object;

	return get_any_partial(s, flags, c);
@@ -2687,6 +2687,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
{
	void *freelist;
	struct page *page;
+	bool try_thisnode = true;
+	gfp_t pc_gfpflags;

	stat(s, ALLOC_SLOWPATH);

@@ -2764,9 +2766,33 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
		goto redo;
	}

-	freelist = new_slab_objects(s, gfpflags, node, &c);
+new_objects:
+
+	pc_gfpflags = gfpflags;
+	/*
+	 * When a preferred node is indicated but no __GFP_THISNODE
+	 *
+	 * 1) try to get a partial slab from target node only by having
+	 *    __GFP_THISNODE in pc_gfpflags for new_slab_objects()
+	 * 2) if 1) failed, try to allocate a new slab from target node with
+	 *    GPF_NOWAIT | __GFP_THISNODE opportunistically
+	 * 3) if 2) failed, retry with original gfpflags which will allow
+	 *    new_slab_objects() try partial lists of other nodes before potentially
+	 *    allocating new page from other nodes
+	 */
+	if (unlikely(node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE)
+		     && try_thisnode))
+		pc_gfpflags = GFP_NOWAIT | __GFP_THISNODE;
+
+	freelist = new_slab_objects(s, pc_gfpflags, node, &c);

	if (unlikely(!freelist)) {
+		if (node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE)
+		    && try_thisnode) {
+			try_thisnode = false;
+			goto new_objects;
+		}
+
		slab_out_of_memory(s, gfpflags, node);
		return NULL;
	}