Unverified Commit f7cf91c9 authored by openeuler-ci-bot, committed by Gitee

!235 Introduce memory reliable

Merge Pull Request from: @ma-wupeng 
 
Introduction
============

The memory reliable feature is a memory tiering mechanism. It is based on
the kernel mirror feature, which splits memory into two separate regions:
a mirrored (reliable) region and a non-mirrored (non-reliable) region.

For the kernel mirror feature:

- kernel memory is allocated from the mirrored region by default
- user memory is allocated from the non-mirrored region by default

The non-mirrored region is arranged into ZONE_MOVABLE.

The memory reliable feature adds the following on top of kernel mirroring:

- normal user tasks never allocate memory from the mirrored region via
  userspace APIs (malloc, mmap, etc.)
- special user tasks allocate memory from the mirrored region by default
- tmpfs/pagecache allocate memory from the mirrored region by default
- an upper limit on the mirrored memory allocated for user tasks, tmpfs
  and pagecache

A reliable fallback mechanism is also supported, which allows special user
tasks, tmpfs and pagecache to fall back to allocating from the non-mirrored
region; this is the default setting.
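The fallback policy above can be modelled roughly as follows. This is an illustrative sketch, not kernel code: the region names, the capacity counters and the `reliable_alloc()` helper are all hypothetical stand-ins for the buddy-allocator logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the reliable-fallback policy: try the mirrored
 * (reliable) region first, and fall back to the non-mirrored region only
 * when fallback is enabled (the default in this patch set). */
enum region { REGION_MIRRORED, REGION_NON_MIRRORED, REGION_NONE };

/* Simulated capacity left in each region, in bytes. */
static size_t mirrored_free = 4096;
static size_t non_mirrored_free = 8192;

enum region reliable_alloc(size_t size, bool fallback_enabled)
{
    if (size <= mirrored_free) {
        mirrored_free -= size;
        return REGION_MIRRORED;
    }
    if (fallback_enabled && size <= non_mirrored_free) {
        non_mirrored_free -= size;
        return REGION_NON_MIRRORED;
    }
    return REGION_NONE; /* allocation fails, analogous to -ENOMEM */
}
```

With fallback disabled (the reliable_debug=F behaviour documented below in kernel-parameters.txt), an allocation that does not fit in the mirrored region simply fails.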

To achieve this:

- a ___GFP_RELIABLE flag is added for allocating memory from the mirrored
  region.

- the high_zoneidx for special user tasks/tmpfs/pagecache is set to
  ZONE_NORMAL.

- normal user tasks can only allocate from ZONE_MOVABLE.
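The zone-selection rule in the three points above can be sketched like this. The flag value, the zone enum and the helper function are simplified illustrations, not the kernel's actual definitions:

```c
#include <stdbool.h>

/* Illustrative stand-in for the real GFP flag bit. */
#define ___GFP_RELIABLE 0x1000000u

enum zone_idx { ZONE_DMA, ZONE_NORMAL, ZONE_MOVABLE };

/* Allocations tagged ___GFP_RELIABLE (special user tasks, tmpfs,
 * pagecache) are capped at ZONE_NORMAL so they land in the mirrored
 * region; ordinary user allocations go to ZONE_MOVABLE, i.e. the
 * non-mirrored region. */
enum zone_idx gfp_high_zoneidx(unsigned int gfp, bool is_user)
{
    if (gfp & ___GFP_RELIABLE)
        return ZONE_NORMAL;
    return is_user ? ZONE_MOVABLE : ZONE_NORMAL;
}
```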

This patch set provides only the main framework; memory reliable support
for special user tasks, pagecache and tmpfs has its own patches.

To enable this feature, mirrored (reliable) memory is needed and
"kernelcore=reliable" must be added to the kernel parameters.

PR for 22.09: https://gitee.com/openeuler/kernel/pulls/79 
 
Link: https://gitee.com/openeuler/kernel/pulls/235

 
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
parents 61f39efb 426b9efe
+13 −0
@@ -539,6 +539,10 @@

	cio_ignore=	[S390]
			See Documentation/s390/common_io.rst for details.

	clear_freelist
			Enable clear_freelist feature.

	clk_ignore_unused
			[CLK]
			Prevents the clock framework from automatically gating
@@ -4785,6 +4789,15 @@
			[KNL, SMP] Set scheduler's default relax_domain_level.
			See Documentation/admin-guide/cgroup-v1/cpusets.rst.

	reliable_debug=	[ARM64]
			Format: [F][,S][,P]
			Only works when CONFIG_MEMORY_RELIABLE is enabled and
			"kernelcore=reliable" is configured.
			F: User memory allocation (special user tasks, tmpfs)
			will not fall back to the non-mirrored region on failure.
			S: Shmem does not use reliable memory.
			P: Page cache does not use reliable memory.
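The [F][,S][,P] format above can be parsed roughly as follows. This is a hypothetical sketch of the option syntax, not the kernel's actual parser; the struct and function names are illustrative:

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative representation of the three reliable_debug= switches. */
struct reliable_debug_opts {
    bool no_fallback;   /* F: no fallback to the non-mirrored region */
    bool no_shmem;      /* S: shmem does not use reliable memory */
    bool no_pagecache;  /* P: page cache does not use reliable memory */
};

/* Accepts comma-separated tokens F, S, P; rejects anything else. */
bool parse_reliable_debug(const char *arg, struct reliable_debug_opts *o)
{
    memset(o, 0, sizeof(*o));
    while (*arg) {
        switch (*arg) {
        case 'F': o->no_fallback = true; break;
        case 'S': o->no_shmem = true; break;
        case 'P': o->no_pagecache = true; break;
        default: return false;
        }
        arg++;
        if (*arg == ',')
            arg++;          /* skip separator before the next token */
        else if (*arg)
            return false;   /* tokens must be single letters */
    }
    return true;
}
```

For example, booting with reliable_debug=F,P would disable the fallback and keep the page cache out of reliable memory, while shmem still uses it.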

	reserve=	[KNL,BUGS] Force kernel to ignore I/O ports or memory
			Format: <base1>,<size1>[,<base2>,<size2>,...]
			Reserve I/O ports or memory so the kernel won't use
+13 −0
@@ -25,6 +25,7 @@ files can be found in mm/swap.c.
Currently, these files are in /proc/sys/vm:

- admin_reserve_kbytes
- clear_freelist_pages
- compact_memory
- compaction_proactiveness
- compact_unevictable_allowed
@@ -109,6 +110,18 @@ On x86_64 this is about 128MB.
Changing this takes effect whenever an application requests memory.


clear_freelist_pages
====================

Available only when CONFIG_CLEAR_FREELIST_PAGE is set. When 1 is written to the
file, all pages in the free lists are overwritten with 0.

The zone lock is held during clear_freelist_pages; if the execution time is too
long, RCU CPU stall warnings may be printed. For each NUMA node,
clear_freelist_pages runs on a "random" CPU of that node.
The time consumed depends on the hardware.
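The behaviour described above can be modelled with a toy free list. The structures are illustrative, not the kernel's buddy-allocator types:

```c
#include <stddef.h>
#include <string.h>

/* Toy model of clear_freelist_pages: every page on the free list is
 * overwritten with zeroes. */
#define TOY_PAGE_SIZE 64

struct toy_page {
    unsigned char data[TOY_PAGE_SIZE];
    struct toy_page *next;  /* link in the free list */
};

/* Walk the list and zero each page, returning the page count. The real
 * implementation does this under the zone lock, which is why a long
 * walk can trigger RCU CPU stall warnings. */
size_t clear_freelist(struct toy_page *head)
{
    size_t n = 0;

    for (; head; head = head->next) {
        memset(head->data, 0, TOY_PAGE_SIZE);
        n++;
    }
    return n;
}
```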


compact_memory
==============

+8 −0
@@ -195,6 +195,7 @@ read the file /proc/PID/status::
  VmPTE:        20 kb
  VmSwap:        0 kB
  HugetlbPages:          0 kB
  Reliable:         1608 kB
  CoreDumping:    0
  THP_enabled:	  1
  Threads:        1
@@ -275,6 +276,7 @@ It's slow but very precise.
 VmSwap                      amount of swap used by anonymous private data
                             (shmem swap usage is not included)
 HugetlbPages                size of hugetlb memory portions
 Reliable                    size of reliable memory used
 CoreDumping                 process's memory is currently being dumped
                             (killing the process may lead to a corrupted core)
 THP_enabled		     process is allowed to use THP (returns 0 when
@@ -971,6 +973,8 @@ varies by architecture and compile options. The following is from a
    ShmemPmdMapped:      0 kB
    ReliableTotal: 7340032 kB
    ReliableUsed:   418824 kB
    ReliableBuddyMem: 418824 kB
    ReliableShmem:        96 kB

MemTotal
              Total usable RAM (i.e. physical RAM minus a few reserved
@@ -1104,6 +1108,10 @@ ReliableTotal
              Total reliable memory size
ReliableUsed
              The used amount of reliable memory
ReliableBuddyMem
              Size of unused mirrored memory in buddy system
ReliableShmem
              Total reliable memory used by shared memory
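The new Reliable* fields land in /proc/meminfo alongside the existing ones, so userspace can read them with ordinary text parsing. The helper below is an assumption for illustration, not a kernel or libc API:

```c
#include <stdio.h>
#include <string.h>

/* Extract a "<Field>: <value> kB" entry from a meminfo-style buffer.
 * Returns the value in kB, or -1 if the field is absent or malformed. */
long meminfo_field_kb(const char *buf, const char *field)
{
    const char *p = strstr(buf, field);
    long kb;

    if (!p)
        return -1;
    p += strlen(field);
    if (sscanf(p, ": %ld kB", &kb) != 1)
        return -1;
    return kb;
}
```

Applied to the sample output shown above, meminfo_field_kb(buf, "ReliableUsed") would return 418824.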

vmallocinfo
~~~~~~~~~~~
+3 −3
@@ -1264,14 +1264,14 @@ static const struct file_operations proc_oom_score_adj_operations = {
 static inline int reliable_check(struct task_struct *task, struct pid *pid)
 {
 	if (!mem_reliable_is_enabled())
-		return -EPERM;
+		return -EACCES;
 
 	if (is_global_init(task))
-		return -EPERM;
+		return -EINVAL;
 
 	if (!task->mm || (task->flags & PF_KTHREAD) ||
 	    (task->flags & PF_EXITING))
-		return -EPERM;
+		return -EINVAL;
 
 	return 0;
 }
+1 −0
@@ -77,6 +77,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
	SEQ_PUT_DEC(" kB\nVmSwap:\t", swap);
	seq_puts(m, " kB\n");
	hugetlb_report_usage(m, mm);
	reliable_report_usage(m, mm);
}
#undef SEQ_PUT_DEC
