Commit fdbe4cbc authored by Xueshi Hu's avatar Xueshi Hu Committed by Liu Shixin
Browse files

mm/hugetlb: fix nodes huge page allocation when there are surplus pages

mainline inclusion
from mainline-v6.7-rc1
commit b72b3c9c34c825c81d205241c5f822fc7835923f
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBE66K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b72b3c9c34c825c81d205241c5f822fc7835923f

--------------------------------

In set_nr_huge_pages(), local variable "count" is used to record
persistent_huge_pages(), but when it cames to nodes huge page allocation,
the semantics changes to nr_huge_pages.  When there exists surplus huge
pages and using the interface under
/sys/devices/system/node/node*/hugepages to change huge page pool size,
this difference can result in the allocation of an unexpected number of
huge pages.

Steps to reproduce the bug:

Starting with:

				  Node 0          Node 1    Total
	HugePages_Total             0.00            0.00     0.00
	HugePages_Free              0.00            0.00     0.00
	HugePages_Surp              0.00            0.00     0.00

create 100 huge pages in Node 0 and consume it, then set Node 0 's
nr_hugepages to 0.

yields:

				  Node 0          Node 1    Total
	HugePages_Total           200.00            0.00   200.00
	HugePages_Free              0.00            0.00     0.00
	HugePages_Surp            200.00            0.00   200.00

write 100 to Node 1's nr_hugepages

		echo 100 > /sys/devices/system/node/node1/\
	hugepages/hugepages-2048kB/nr_hugepages

gets:

				  Node 0          Node 1    Total
	HugePages_Total           200.00          400.00   600.00
	HugePages_Free              0.00          400.00   400.00
	HugePages_Surp            200.00            0.00   200.00

Kernel is expected to create only 100 huge pages and it gives 200.

Link: https://lkml.kernel.org/r/20230829033343.467779-1-xueshi.hu@smartx.com


Fixes: 9a305230 ("hugetlb: add per node hstate attributes")
Signed-off-by: default avatarXueshi Hu <xueshi.hu@smartx.com>
Reviewed-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLiu Shixin <liushixin2@huawei.com>
parent 343e35e5
Loading
Loading
Loading
Loading
+3 −1
Original line number Diff line number Diff line
@@ -2837,7 +2837,9 @@ static unsigned long set_max_huge_pages(struct hstate *h, unsigned long count,
	if (nid != NUMA_NO_NODE) {
		unsigned long old_count = count;

		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
		count += persistent_huge_pages(h) -
			 (h->nr_huge_pages_node[nid] -
			  h->surplus_huge_pages_node[nid]);
		/*
		 * User may have specified a large count value which caused the
		 * above calculation to overflow.  In this case, they wanted