Commit c5a37a37, authored by openeuler-ci-bot, committed by Gitee

!220 Intel Advanced Matrix Extensions (AMX) support on SPR

Merge Pull Request from: @Linwang_68f8 
 
Content:
Intel® Advanced Matrix Extensions (Intel® AMX) is a new 64-bit programming paradigm consisting of two components: a set of 2-dimensional registers (tiles) representing sub-arrays from a larger 2-dimensional memory image, and an accelerator able to operate on tiles. The first implementation of the accelerator is called TMUL (tile matrix multiply unit).

This patch set contains 182 patches, including KABI fixes from Zheng Zengkai <zhengzengkai@huawei.com>.

Please note that, to keep the KABI consistent, the following 9 commits had to be dropped:
0c2e62ba x86/extable: Remove EX_TYPE_FAULT from MCE safe fixups
c6304556 x86/fpu: Use EX_TYPE_FAULT_MCE_SAFE for exception fixups
c1c97d17 x86/copy_mc: Use EX_TYPE_DEFAULT_MCE_SAFE for exception fixups
2cadf524 x86/extable: Provide EX_TYPE_DEFAULT_MCE_SAFE and EX_TYPE_FAULT_MCE_SAFE
46d28947 x86/extable: Rework the exception table mechanics
083b32d6 x86/mce: Get rid of stray semicolons
e42404af x86/mce: Deduplicate exception handling
32fd8b59 x86/extable: Get rid of redundant macros
326b567f x86/extable: Tidy up redundant handler functions

Intel-kernel issue:
https://gitee.com/openeuler/intel-kernel/issues/I590ZC

Test environment:
openEuler 22.09 + backporting kernel

Test cases:
Kernel self-tests, including sigaltstack and AMX state management tests.
TMUL functional testing.
AMX stress.
Context switch testing.
INT8/BF16 online inference.

Known issue:
N/A

Default config change:



```
@@ -479,6 +494,7 @@ CONFIG_LEGACY_VSYSCALL_EMULATE=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_MODIFY_LDT_SYSCALL=y
+# CONFIG_STRICT_SIGALTSTACK_SIZE is not set
CONFIG_HAVE_LIVEPATCH_FTRACE=y
CONFIG_HAVE_LIVEPATCH_WO_FTRACE=y
 
@@ -845,6 +861,7 @@ CONFIG_HAVE_STATIC_CALL=y
CONFIG_HAVE_STATIC_CALL_INLINE=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
+CONFIG_DYNAMIC_SIGFRAME=y
```

 
 
Link: https://gitee.com/openeuler/kernel/pulls/220

 
Reviewed-by: Liu Chao <liuchao173@huawei.com>
Reviewed-by: Chen Wei <chenwei@xfusion.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Jun Tian <jun.j.tian@intel.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
parents 05e8c95f 6411e75a
+9 −0
@@ -5474,6 +5474,15 @@
	stifb=		[HW]
			Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]]

        strict_sas_size=
			[X86]
			Format: <bool>
			Enable or disable strict sigaltstack size checks
			against the required signal frame size which
			depends on the supported FPU features. This can
			be used to filter out binaries which have
			not yet been made aware of AT_MINSIGSTKSZ.

	sunrpc.min_resvport=
	sunrpc.max_resvport=
			[NFS,SUNRPC]
+53 −0
.. SPDX-License-Identifier: GPL-2.0

==================================
x86-specific ELF Auxiliary Vectors
==================================

This document describes the semantics of the x86 auxiliary vectors.

Introduction
============

ELF Auxiliary vectors enable the kernel to efficiently provide
configuration-specific parameters to userspace. In this example, a program
allocates an alternate stack based on the kernel-provided size::

   #include <sys/auxv.h>
   #include <elf.h>
   #include <signal.h>
   #include <stdlib.h>
   #include <assert.h>
   #include <err.h>

   #ifndef AT_MINSIGSTKSZ
   #define AT_MINSIGSTKSZ	51
   #endif

   ....
   stack_t ss;

   /* Compute the size first so that malloc() sees an initialized value. */
   ss.ss_size = getauxval(AT_MINSIGSTKSZ) + SIGSTKSZ;
   ss.ss_sp = malloc(ss.ss_size);
   assert(ss.ss_sp);

   ss.ss_flags = 0;

   if (sigaltstack(&ss, NULL))
        err(1, "sigaltstack");


The exposed auxiliary vectors
=============================

AT_SYSINFO is used for locating the vsyscall entry point.  It is not
exported in 64-bit mode.

AT_SYSINFO_EHDR is the start address of the page containing the vDSO.

AT_MINSIGSTKSZ denotes the minimum stack size required by the kernel to
deliver a signal to user-space.  AT_MINSIGSTKSZ covers the space
consumed by the kernel to accommodate the user context for the current
hardware configuration.  It does not cover subsequent user-space stack
consumption, which must be added by the user.  (For example, the code above
adds SIGSTKSZ to AT_MINSIGSTKSZ.)
+2 −0
@@ -36,3 +36,5 @@ x86-specific Documentation
   x86_64/index
   sva
   sgx
   elf_auxvec
   xstate
+74 −0
Using XSTATE features in user space applications
================================================

The x86 architecture supports floating-point extensions which are
enumerated via CPUID. Applications consult CPUID and use XGETBV to
evaluate which features have been enabled by the kernel in XCR0.

Up to AVX-512 and PKRU states, these features are automatically enabled by
the kernel if available. Features like AMX TILE_DATA (XSTATE component 18)
are enabled in XCR0 as well, but the first use of a related instruction is
trapped by the kernel because by default the required large XSTATE buffers
are not allocated automatically.
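
The CPUID and XGETBV check described above might look like the following sketch. The helper names ``read_xcr0`` and ``xcr0_has_tile_data`` are hypothetical; the code assumes a GCC or Clang toolchain (for ``<cpuid.h>`` and inline assembly) and reports failure on non-x86 builds instead of executing XGETBV.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Hypothetical helper: read XCR0 through XGETBV.
 * Returns 0 on success, -1 if XGETBV is not usable from user space.
 */
static int read_xcr0(uint64_t *xcr0)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* CPUID.1:ECX bit 27 (OSXSAVE) tells us the OS has enabled XSAVE. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
        return -1;

    /* XGETBV with ECX = 0 returns XCR0 in EDX:EAX. */
    __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
    *xcr0 = ((uint64_t)edx << 32) | eax;
    return 0;
#else
    (void)xcr0;
    return -1;
#endif
}

/* XSTATE component 18 is AMX TILE_DATA, as stated in the text above. */
static int xcr0_has_tile_data(uint64_t xcr0)
{
    return (int)((xcr0 >> 18) & 1);
}
```

A set TILE_DATA bit in XCR0 only means the kernel has enabled the component; as the paragraph above explains, the first AMX instruction still traps until the larger XSTATE buffer exists.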

Using dynamically enabled XSTATE features in user space applications
--------------------------------------------------------------------

The kernel provides an arch_prctl(2) based mechanism for applications to
request the usage of such features. The arch_prctl(2) options related to
this are:

-ARCH_GET_XCOMP_SUPP

 arch_prctl(ARCH_GET_XCOMP_SUPP, &features);

 ARCH_GET_XCOMP_SUPP stores the supported features in userspace storage of
 type uint64_t. The second argument is a pointer to that storage.

-ARCH_GET_XCOMP_PERM

 arch_prctl(ARCH_GET_XCOMP_PERM, &features);

 ARCH_GET_XCOMP_PERM stores the features for which the userspace process
 has permission in userspace storage of type uint64_t. The second argument
 is a pointer to that storage.

-ARCH_REQ_XCOMP_PERM

 arch_prctl(ARCH_REQ_XCOMP_PERM, feature_nr);

 ARCH_REQ_XCOMP_PERM requests permission for a dynamically enabled
 feature or a feature set. A feature set can be mapped to a facility, e.g.
 AMX, and can require one or more XSTATE components to be enabled.

 The feature argument is the number of the highest XSTATE component which
 is required for a facility to work.

When permission is requested for a feature, the kernel checks that the
feature is available. The kernel also ensures that sigaltstacks in the
process's tasks are large enough to accommodate the resulting large signal
frame. It enforces this both during ARCH_REQ_XCOMP_PERM and during any
subsequent sigaltstack(2) calls. If an installed sigaltstack is smaller than
the resulting sigframe size, ARCH_REQ_XCOMP_PERM results in -ENOSUPP. Also,
sigaltstack(2) results in -ENOMEM if the requested altstack is too small
for the permitted features.
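
The three arch_prctl(2) options could be combined as in the sketch below. The helper names ``xcomp_call`` and ``request_tile_data`` are hypothetical. The numeric option values are the ones defined in the x86 uapi header ``asm/prctl.h`` and are repeated here only so the example builds with older installed headers. glibc provides no arch_prctl() wrapper, so the sketch goes through syscall(2).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef ARCH_GET_XCOMP_SUPP
#define ARCH_GET_XCOMP_SUPP 0x1021
#define ARCH_GET_XCOMP_PERM 0x1022
#define ARCH_REQ_XCOMP_PERM 0x1023
#endif

/* XSTATE component number of AMX TILE_DATA. */
#define XFEATURE_XTILEDATA 18

/* Thin wrapper; fails with ENOSYS where arch_prctl does not exist. */
static long xcomp_call(int option, unsigned long arg)
{
#ifdef SYS_arch_prctl
    return syscall(SYS_arch_prctl, option, arg);
#else
    (void)option;
    (void)arg;
    errno = ENOSYS;
    return -1;
#endif
}

/*
 * Hypothetical helper: returns 1 if the process holds permission for
 * AMX TILE_DATA after the call, 0 otherwise.
 */
static int request_tile_data(void)
{
    uint64_t mask = 0;

    /* Is the feature supported by the CPU and the kernel at all? */
    if (xcomp_call(ARCH_GET_XCOMP_SUPP, (unsigned long)&mask) != 0)
        return 0;
    if (!(mask & (1ULL << XFEATURE_XTILEDATA)))
        return 0;

    /* Request permission by the highest required component number. */
    if (xcomp_call(ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) != 0)
        return 0;

    /* Read the permission mask back to confirm the grant. */
    mask = 0;
    if (xcomp_call(ARCH_GET_XCOMP_PERM, (unsigned long)&mask) != 0)
        return 0;
    return (int)((mask >> XFEATURE_XTILEDATA) & 1);
}
```

An application would typically call ``request_tile_data()`` once at startup, after installing any alternate signal stack, and fall back to a non-AMX code path when it returns 0.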

Permission, when granted, is valid per process. Permissions are inherited
on fork(2) and cleared on exec(3).

The first use of an instruction related to a dynamically enabled feature is
trapped by the kernel. The trap handler checks whether the process has
permission to use the feature. If the process has no permission then the
kernel sends SIGILL to the application. If the process has permission then
the handler allocates a larger xstate buffer for the task so the large
state can be context switched. In the unlikely case that the allocation
fails, the kernel sends SIGSEGV.

Dynamic features in signal frames
---------------------------------

Dynamically enabled features are not written to the signal frame upon signal
entry if the feature is in its initial configuration.  This differs from
non-dynamic features which are always written regardless of their
configuration.  Signal handlers can examine the XSAVE buffer's XSTATE_BV
field to determine if a feature was written.
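
A handler that inspects XSTATE_BV might be sketched as follows. All helper and variable names are hypothetical. The offsets are assumptions based on the x86-64 signal frame layout: a 512-byte FXSAVE legacy area whose software-reserved bytes start at offset 464 and begin with FP_XSTATE_MAGIC1, followed by the XSAVE header, whose first eight bytes are XSTATE_BV. On other architectures the handler records nothing.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>

#ifndef FP_XSTATE_MAGIC1
#define FP_XSTATE_MAGIC1 0x46505853U
#endif
#define FXSAVE_SW_OFFSET 464 /* assumed start of the software-reserved bytes */
#define XSAVE_HDR_OFFSET 512 /* assumed start of the XSAVE header */

static volatile sig_atomic_t frame_has_xsave;
static volatile uint64_t frame_xstate_bv;

/* Records XSTATE_BV from the signal frame, if an XSAVE area is present. */
static void on_signal(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)info;
#if defined(__x86_64__)
    const ucontext_t *uc = ctx;
    const unsigned char *fp = (const unsigned char *)uc->uc_mcontext.fpregs;
    uint32_t magic;
    uint64_t bv;

    if (!fp)
        return;
    /* Without the magic value the frame holds only legacy FXSAVE state. */
    memcpy(&magic, fp + FXSAVE_SW_OFFSET, sizeof(magic));
    if (magic != FP_XSTATE_MAGIC1)
        return;
    memcpy(&bv, fp + XSAVE_HDR_OFFSET, sizeof(bv));
    frame_xstate_bv = bv;
    frame_has_xsave = 1;
#else
    (void)ctx;
#endif
}

/* Installs on_signal() with SA_SIGINFO so the handler receives the ucontext. */
static int install_handler(int sig)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Returns 1 if XSTATE component nr was written to the last recorded frame. */
static int frame_wrote_feature(unsigned int nr)
{
    return frame_has_xsave && ((frame_xstate_bv >> nr) & 1);
}
```

After a signal has been handled, ``frame_wrote_feature(18)`` would indicate whether AMX TILE_DATA was saved in that frame.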
+3 −0
@@ -1159,6 +1159,9 @@ config ARCH_SPLIT_ARG64
config HAVE_ARCH_NODE_DEV_GROUP
	bool

config DYNAMIC_SIGFRAME
	bool

source "kernel/gcov/Kconfig"

source "scripts/gcc-plugins/Kconfig"