Commit 67d02b4d authored by Roman Gushchin, committed by Zheng Zengkai

percpu: implement partial chunk depopulation

mainline inclusion
from mainline-v5.14-rc1
commit f1833241
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4BE79


CVE: NA

-------------------------------------------------
From Roman ("percpu: partial chunk depopulation"):
In our [Facebook] production experience the percpu memory allocator
sometimes struggles to return memory to the system. A typical
example is the creation of several thousand memory cgroups (each has
several chunks of percpu data used for vmstats, vmevents,
ref counters etc). Deleting and completely releasing these cgroups
doesn't always shrink the percpu memory, so sometimes several GB of
memory are wasted.

The underlying problem is fragmentation: to release a chunk, all
percpu allocations in it must be released first. The percpu
allocator tends to top up chunks to improve utilization. This means
new small-ish allocations (e.g. percpu ref counters) are placed onto
almost-full old-ish chunks, effectively pinning them in memory.
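
The pinning effect can be reproduced with a toy model. This is only a
sketch, not kernel code: the chunk capacity, the "fullest chunk first"
placement rule, and the names allocate(), free_owner() and releasable()
are assumptions made for illustration.

```python
# Toy model, not kernel code: the chunk size and the "fullest chunk that
# still fits" top-up policy are simplifying assumptions.
CHUNK_UNITS = 8  # hypothetical capacity of one chunk, in allocation units


def allocate(chunks, owner, size):
    """Place an allocation in the fullest chunk that still has room."""
    candidates = [c for c in chunks if sum(c.values()) + size <= CHUNK_UNITS]
    if candidates:
        target = max(candidates, key=lambda c: sum(c.values()))
    else:
        target = {}  # no chunk has room, so create a new one
        chunks.append(target)
    target[owner] = target.get(owner, 0) + size


def free_owner(chunks, owner):
    """Drop every allocation that belongs to `owner`."""
    for chunk in chunks:
        chunk.pop(owner, None)


def releasable(chunks):
    """A chunk can go back to the system only when it is completely empty."""
    return sum(1 for c in chunks if not c)


chunks = []
for i in range(4):
    allocate(chunks, f"cg{i}", 7)   # large per-cgroup data, one chunk each
for i in range(4):
    allocate(chunks, f"ref{i}", 1)  # small long-lived allocations top up chunks
for i in range(4):
    free_owner(chunks, f"cg{i}")    # the cgroups are deleted

used = sum(sum(c.values()) for c in chunks)
print(len(chunks), releasable(chunks), used)  # prints: 4 0 4
```

In this model four chunks stay allocated although only 4 of 32 units
are in use, because each chunk still holds one small allocation.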

This patchset solves the problem by implementing partial depopulation
of percpu chunks: chunks with many empty pages are asynchronously
depopulated and their pages are returned to the system.
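
The idea can be sketched with a toy model. This is not the kernel
implementation: the Chunk class, PAGES_PER_CHUNK, EMPTY_PAGE_THRESHOLD
and depopulate_pass() are hypothetical, the real selection criterion is
not described here, and the pass runs synchronously in this sketch
whereas the patchset does it asynchronously.

```python
from dataclasses import dataclass, field

# Toy model, not kernel code. PAGES_PER_CHUNK and EMPTY_PAGE_THRESHOLD are
# hypothetical values chosen for illustration.
PAGES_PER_CHUNK = 8
EMPTY_PAGE_THRESHOLD = 4


@dataclass
class Chunk:
    # live[i] counts the allocations that still touch page i
    live: list = field(default_factory=lambda: [0] * PAGES_PER_CHUNK)
    # populated[i] is True while page i is backed by physical memory
    populated: list = field(default_factory=lambda: [True] * PAGES_PER_CHUNK)

    def empty_populated_pages(self):
        """Indices of pages that are backed by memory but hold no allocation."""
        return [i for i in range(PAGES_PER_CHUNK)
                if self.populated[i] and self.live[i] == 0]


def depopulate_pass(chunks):
    """Give back empty pages of sparsely used chunks; return the page count."""
    freed = 0
    for chunk in chunks:
        empty = chunk.empty_populated_pages()
        if len(empty) < EMPTY_PAGE_THRESHOLD:
            continue  # chunk is well used; leave it alone
        for i in empty:
            chunk.populated[i] = False  # the page goes back to the system
            freed += 1
    return freed


pinned = Chunk()
pinned.live[0] = 1              # one small allocation keeps the chunk alive
busy = Chunk()
busy.live = [1] * 6 + [0, 0]    # only two empty pages, below the threshold

freed = depopulate_pass([pinned, busy])
print(freed, sum(pinned.populated), sum(busy.populated))  # prints: 7 1 8
```

The pinned chunk itself stays allocated, but seven of its eight pages
are returned; the well-used chunk is left untouched.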

To illustrate the problem the following script can be used:
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
parent 8e7c786a