Commit ea6de9f6 authored by openeuler-ci-bot, committed by Gitee

!3921 mm: mem_reliable: Introduce memory reliable

Merge Pull Request from: @ci-robot 
 
PR sync from: Wupeng Ma <mawupeng1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/IGGF4NQ2OHXEVHFTNO4A6DC6ODAQA42Q/ 
From: Ma Wupeng <mawupeng1@huawei.com>

Introduction
============

The memory reliable feature is a memory tiering mechanism. It is based on
the kernel mirror feature, which splits memory into two separate regions:
a mirrored (reliable) region and a non-mirrored (non-reliable) region.

The kernel mirror feature:

- allocates kernel memory from the mirrored region by default
- allocates user memory from the non-mirrored region by default

The non-mirrored region is arranged into ZONE_MOVABLE.

The memory reliable feature adds the following on top of the kernel mirror
feature:

- normal user tasks never allocate memory from the mirrored region through
  userspace APIs (malloc, mmap, etc.)
- special user tasks allocate memory from the mirrored region by default
- shmem/pagecache allocate memory from the mirrored region by default
- an upper limit on the mirrored memory allocated for user tasks, shmem and
  page cache

A reliable fallback mechanism is supported, which allows special user tasks,
shmem and page cache to fall back to allocating from the non-mirrored region.
Fallback is the default setting.

To achieve this:

- the GFP_KERNEL flag is added for tasks that allocate memory from the
  mirrored region.
- the high_zoneidx for special user tasks/shmem/pagecache is set to
  ZONE_NORMAL so that they allocate memory from the mirrored region.
- normal user tasks can only allocate memory from ZONE_MOVABLE.
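
The placement rules listed above can be illustrated with a small userspace
model. This is only a sketch and not code from the series: the enum values,
`struct alloc_plan` and `plan_alloc()` are hypothetical names made up for the
illustration, and only ZONE_NORMAL, ZONE_MOVABLE and the fallback behaviour
are taken from the cover letter.

```c
#include <stdbool.h>

/* Simplified stand-ins for the two zones named in the cover letter. */
enum zone_idx { ZONE_NORMAL_IDX, ZONE_MOVABLE_IDX };

/* Hypothetical classes of user-visible allocations. */
enum alloc_class {
	ALLOC_NORMAL_TASK,
	ALLOC_SPECIAL_TASK,
	ALLOC_SHMEM,
	ALLOC_PAGECACHE,
};

struct alloc_plan {
	enum zone_idx first;	/* zone tried first */
	bool may_fallback;	/* may retry in the non-mirrored region */
};

/*
 * Normal user tasks only use ZONE_MOVABLE (non-mirrored).  Special tasks,
 * shmem and page cache start in ZONE_NORMAL (mirrored) and may fall back
 * to the non-mirrored region when fallback is enabled.
 */
struct alloc_plan plan_alloc(enum alloc_class c, bool fallback_enabled)
{
	struct alloc_plan p;

	if (c == ALLOC_NORMAL_TASK) {
		p.first = ZONE_MOVABLE_IDX;
		p.may_fallback = false;
	} else {
		p.first = ZONE_NORMAL_IDX;
		p.may_fallback = fallback_enabled;
	}
	return p;
}
```

Calling `plan_alloc(ALLOC_SHMEM, true)` in this model yields a plan that
starts in the mirrored zone and is allowed to fall back, which matches the
default setting described above.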

Changelog since v1:
- update bugzilla url.

Chen Wandun (1):
  mm: mem_reliable: Alloc pagecache from reliable region

Ma Wupeng (16):
  proc: introduce proc_hide_ents to hide proc files
  efi: Disable mirror feature during crashkernel
  mm: mem_reliable: Introduce memory reliable
  mm: mem_reliable: Alloc task memory from reliable region
  mm: mem_reliable: Add memory reliable support during hugepaged
    collapse
  mm/memblock: Introduce ability to alloc memory from specify memory
    region
  mm/hugetlb: Allocate non-mirrored memory by default
  mm: mem_reliable: Count reliable page cache usage
  mm: mem_reliable: Count reliable shmem usage
  mm: mem_reliable: Show reliable meminfo
  mm: mem_reliable: Add limiting the usage of reliable memory
  mm: mem_reliable: Introduce fallback mechanism for memory reliable
  proc: mem_reliable: Count reliable memory usage of reliable tasks
  mm: mem_reliable: Introduce proc interface to disable memory reliable
    features
  mm: mem_reliable: Show debug info about memory reliable if oom occurs
  config: enable MEMORY_RELIABLE by default

Peng Wu (1):
  mm: mem_reliable: Add cmdline reliable_debug to enable separate
    feature

Zhou Guanghui (1):
  shmem: mem_reliable: Alloc shmem from reliable region


-- 
2.25.1
 
https://gitee.com/openeuler/kernel/issues/I8USBA 
 
Link:https://gitee.com/openeuler/kernel/pulls/3921

 

Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Zucheng Zheng <zhengzucheng@huawei.com>
Reviewed-by: Liu Chao <liuchao173@huawei.com>
Reviewed-by: Xu Kuohai <xukuohai@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Reviewed-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
parents cb3c3cb3 36f2fd0d
+14 −1
@@ -2449,7 +2449,7 @@
	keepinitrd	[HW,ARM]

	kernelcore=	[KNL,X86,IA-64,PPC]
			Format: nn[KMGTPE] | nn% | "mirror"
			Format: nn[KMGTPE] | nn% | "mirror" | "reliable"
			This parameter specifies the amount of memory usable by
			the kernel for non-movable allocations.  The requested
			amount is spread evenly throughout all nodes in the
@@ -2473,6 +2473,10 @@
			for Movable pages.  "nn[KMGTPE]", "nn%", and "mirror"
			are exclusive, so you cannot specify multiple forms.

			Option "reliable" is based on option "mirror", with
			some extensions. These two options are alternatives.
			Currently only arm64 is supported.

	kgdbdbgp=	[KGDB,HW] kgdb over EHCI usb debug port.
			Format: <Controller#>[,poll interval]
			The controller # is the number of the ehci usb debug
@@ -5514,6 +5518,15 @@
			[KNL, SMP] Set scheduler's default relax_domain_level.
			See Documentation/admin-guide/cgroup-v1/cpusets.rst.

	reliable_debug=	[ARM64]
			Format: [P][,S][,F]
			Only works with CONFIG_MEMORY_RELIABLE and
			"kernelcore=reliable" is configured.
			P: Page cache does not use the reliable memory.
			S: The shmem does not use the reliable memory.
			F: User memory allocations (special user task, tmpfs)
			will not fall back to the non-mirrored region on failure.

	reserve=	[KNL,BUGS] Force kernel to ignore I/O ports or memory
			Format: <base1>,<size1>[,<base2>,<size2>,...]
			Reserve I/O ports or memory so the kernel won't use
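
The `reliable_debug=` format documented in this hunk, `[P][,S][,F]`, can be
illustrated with a small userspace parser. This is a sketch under stated
assumptions, not the parser added by the series: `struct reliable_debug_opts`
and `parse_reliable_debug()` are hypothetical names, and rejecting unknown
letters is a choice made for the example only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result structure; one flag per documented letter. */
struct reliable_debug_opts {
	bool pagecache_off;	/* P: page cache does not use reliable memory */
	bool shmem_off;		/* S: shmem does not use reliable memory */
	bool fallback_off;	/* F: no fallback to the non-mirrored region */
};

/*
 * Parse a comma-separated list of single-letter options such as "P,F".
 * Returns 0 on success and -1 on an unknown letter or a malformed list.
 * An empty or NULL string leaves every flag cleared.
 */
int parse_reliable_debug(const char *s, struct reliable_debug_opts *o)
{
	o->pagecache_off = false;
	o->shmem_off = false;
	o->fallback_off = false;

	if (s == NULL || *s == '\0')
		return 0;

	for (;;) {
		switch (*s) {
		case 'P': o->pagecache_off = true; break;
		case 'S': o->shmem_off = true; break;
		case 'F': o->fallback_off = true; break;
		default:  return -1;
		}
		s++;
		if (*s == '\0')
			return 0;
		if (*s != ',')
			return -1;
		s++;	/* skip the comma and expect another letter */
	}
}
```

With this sketch, `reliable_debug=P,F` would map to page cache and fallback
being switched off while shmem keeps using reliable memory.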
+30 −0
@@ -163,6 +163,8 @@ usually fail with ESRCH.
		can be derived from smaps, but is faster and more convenient
 numa_maps	An extension based on maps, showing the memory locality and
		binding policy as well as mem usage (in pages) of each mapping.
 reliable	Present with CONFIG_MEMORY_RELIABLE=y. Task reliable status
		information
 =============  ===============================================================

For example, to get the status information of a process, all you have to do is
@@ -195,6 +197,7 @@ read the file /proc/PID/status::
  VmPTE:        20 kb
  VmSwap:        0 kB
  HugetlbPages:          0 kB
  Reliable:         1608 kB
  CoreDumping:    0
  THP_enabled:	  1
  Threads:        1
@@ -278,6 +281,7 @@ It's slow but very precise.
 VmSwap                      amount of swap used by anonymous private data
                             (shmem swap usage is not included)
 HugetlbPages                size of hugetlb memory portions
 Reliable                    size of reliable memory used
 CoreDumping                 process's memory is currently being dumped
                             (killing the process may lead to a corrupted core)
 THP_enabled		     process is allowed to use THP (returns 0 when
@@ -674,6 +678,10 @@ Where:
node locality page counters (N0 == node0, N1 == node1, ...) and the kernel page
size, in KB, that is backing the mapping up.

The /proc/pid/reliable file is used to control a user task's reliable status.
A task with this flag set can only allocate memory from the mirrored region.
The global init task's reliable flag cannot be accessed.

1.2 Kernel data
---------------

@@ -1021,6 +1029,13 @@ Example output. You may not have all of these fields.
    DirectMap4k:      401152 kB
    DirectMap2M:    10008576 kB
    DirectMap1G:    24117248 kB
    ReliableTotal:     8190696 kB
    ReliableUsed:       252912 kB
    ReliableTaskUsed:   108136 kB
    ReliableBuddyMem:  7937784 kB
    ReliableShmem:         840 kB
    FileCache:          104944 kB
    ReliableFileCache:   102688 kB

MemTotal
              Total usable RAM (i.e. physical RAM minus a few reserved
@@ -1185,6 +1200,21 @@ HugePages_Total, HugePages_Free, HugePages_Rsvd, HugePages_Surp, Hugepagesize, H
DirectMap4k, DirectMap2M, DirectMap1G
              Breakdown of page table sizes used in the kernel's
              identity mapping of RAM
ReliableTotal
              Total reliable memory size
ReliableUsed
              The used amount of reliable memory
ReliableTaskUsed
              Size of mirrored memory used by user task
ReliableBuddyMem
              Size of unused mirrored memory in buddy system
ReliableShmem
              Total reliable memory used by shared memory
FileCache
              Memory usage of page cache
ReliableFileCache
              Reliable memory usage of page cache


vmallocinfo
~~~~~~~~~~~
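
The new /proc/meminfo fields documented in this file can be read back with
ordinary text parsing. The helper below is a minimal userspace sketch;
`meminfo_kb()` is a hypothetical name, and it works on a text buffer rather
than opening /proc/meminfo so that it stays self-contained. In the sample
output shown in the hunk, ReliableUsed plus ReliableBuddyMem equals
ReliableTotal; that is an observation about the sample, not a guarantee
stated by the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find "<key>: <value> kB" in /proc/meminfo-style text and return the
 * value in kB, or -1 if the key is absent or has no numeric value.
 */
long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line != NULL && *line != '\0') {
		/* Skip leading indentation before the field name. */
		const char *p = line + strspn(line, " \t");

		if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
			long v;

			if (sscanf(p + klen + 1, "%ld", &v) == 1)
				return v;
			return -1;
		}
		line = strchr(line, '\n');
		if (line != NULL)
			line++;	/* move past the newline */
	}
	return -1;
}
```

A caller would typically read /proc/meminfo into a buffer and then query
fields such as `meminfo_kb(buf, "ReliableTotal")`.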
+1 −0
@@ -1147,6 +1147,7 @@ CONFIG_LRU_GEN=y
CONFIG_ARCH_SUPPORTS_PER_VMA_LOCK=y
CONFIG_PER_VMA_LOCK=y
CONFIG_LOCK_MM_AND_FIND_VMA=y
CONFIG_MEMORY_RELIABLE=y

#
# Data Access Monitoring
+1 −0
@@ -34,3 +34,4 @@ proc-$(CONFIG_PROC_VMCORE) += vmcore.o
proc-$(CONFIG_PRINTK)	+= kmsg.o
proc-$(CONFIG_PROC_PAGE_MONITOR)	+= page.o
proc-$(CONFIG_BOOT_CONFIG)	+= bootconfig.o
proc-$(CONFIG_MEMORY_RELIABLE)	+= mem_reliable.o
+19 −2
@@ -2657,6 +2657,14 @@ static struct dentry *proc_pident_instantiate(struct dentry *dentry,
	return d_splice_alias(inode, dentry);
}

static bool proc_hide_pidents(const struct pid_entry *p)
{
	if (mem_reliable_hide_file(p->name))
		return true;

	return false;
}

static struct dentry *proc_pident_lookup(struct inode *dir, 
					 struct dentry *dentry,
					 const struct pid_entry *p,
@@ -2675,6 +2683,8 @@ static struct dentry *proc_pident_lookup(struct inode *dir,
	for (; p < end; p++) {
		if (p->len != dentry->d_name.len)
			continue;
		if (proc_hide_pidents(p))
			continue;
		if (!memcmp(dentry->d_name.name, p->name, p->len)) {
			res = proc_pident_instantiate(dentry, task, p);
			break;
@@ -2701,7 +2711,8 @@ static int proc_pident_readdir(struct file *file, struct dir_context *ctx,
		goto out;

	for (p = ents + (ctx->pos - 2); p < ents + nents; p++) {
		if (!proc_fill_cache(file, ctx, p->name, p->len,
		if (!proc_hide_pidents(p) &&
		    !proc_fill_cache(file, ctx, p->name, p->len,
				     proc_pident_instantiate, task, p))
			break;
		ctx->pos++;
@@ -3382,6 +3393,9 @@ static const struct pid_entry tgid_base_stuff[] = {
	ONE("oom_score",  S_IRUGO, proc_oom_score),
	REG("oom_adj",    S_IRUGO|S_IWUSR, proc_oom_adj_operations),
	REG("oom_score_adj", S_IRUGO|S_IWUSR, proc_oom_score_adj_operations),
#ifdef CONFIG_MEMORY_RELIABLE
	REG("reliable", S_IRUGO|S_IWUSR, proc_reliable_operations),
#endif
#ifdef CONFIG_AUDIT
	REG("loginuid",   S_IWUSR|S_IRUGO, proc_loginuid_operations),
	REG("sessionid",  S_IRUGO, proc_sessionid_operations),
@@ -3731,6 +3745,9 @@ static const struct pid_entry tid_base_stuff[] = {
	ONE("oom_score", S_IRUGO, proc_oom_score),
	REG("oom_adj",   S_IRUGO|S_IWUSR, proc_oom_adj_operations),
	REG("oom_score_adj", S_IRUGO|S_IWUSR, proc_oom_score_adj_operations),
#ifdef CONFIG_MEMORY_RELIABLE
	REG("reliable", S_IRUGO|S_IWUSR, proc_reliable_operations),
#endif
#ifdef CONFIG_AUDIT
	REG("loginuid",  S_IWUSR|S_IRUGO, proc_loginuid_operations),
	REG("sessionid",  S_IRUGO, proc_sessionid_operations),