!4230 [OLK-5.10] Intel: backport to support RAS EDAC feature on Granite...
Merge Pull Request from: @wjin123

Title: [OLK-5.10] Intel: backport to support RAS EDAC feature on Granite Rapids (GNR) and Sierra Forest (SRF) server

Content:
Backport to kernel 5.10 to support the RAS EDAC feature on Intel Granite Rapids (GNR) and Sierra Forest (SRF) servers.

The backported upstream kernel commits are listed below:
0cfd8fba - x86/cpu: Fix Crestmont uarch
c545f5e4 - EDAC/i10nm: Skip the absent memory controllers
96ae3995 - EDAC/i10nm: Add Intel Sierra Forest server support
ba987eaa - EDAC/i10nm: Add Intel Granite Rapids server support
dd7814b7 - EDAC/i10nm: Make more configurations CPU model specific

The backported code has been verified on Intel GNR/SRF servers.

Intel-kernel issue:
https://gitee.com/openeuler/intel-kernel/issues/I8Y47N?from=project-issue

Test:
BIOS settings:
System Event Log -> WHEA Settings -> WHEA Support = Enable
System Event Log -> Error Injection Settings -> WHEA Error Injection Support = Enable

1. After the GNR/SRF system powers on, check that the EDAC module is loaded with the command "lsmod | grep -i edac". The EDAC module "i10nm_edac" should be listed, e.g., "i10nm_edac 24576 0".
2. Load the einj.ko kernel module with the command "modprobe einj", then inject a memory CE (correctable error) with the ras-tools command "./cmcistorm 1 1".
3. Check dmesg for the EDAC message. It should contain the detailed decoding information for the CE memory error, similar to:
"[ 148.514145] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_MC#3_Chan#0_DIMM#0 (channel:0 slot:0 page:0x11cc51 offset:0x480 grain:32 syndrome:0x0 - err_code:0x0080:0x0090 SystemAddress:0x11cc51480 ProcessorSocketId:0x0 MemoryControllerId:0x3 ChannelAddress:0x1398a380 ChannelId:0x0 RankAddress:0x4e62880 PhysicalRankId:0x1 DimmSlotId:0x0 DimmRankId:0x1 Row:0x4e4 Column:0x100 Bank:0x3 BankGroup:0x7 ChipSelect:0x1)"

The ras-tools can be downloaded from the link below:
https://git.kernel.org/pub/scm/linux/kernel/git/aegl/ras-tools.git

Known issue: N/A

Default config change: N/A

Link: https://gitee.com/openeuler/kernel/pulls/4230
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
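
When repeating test step 3, it can help to check the decoded EDAC line programmatically rather than by eye. The following is a minimal sketch in Python, not part of the kernel or of ras-tools. The names parse_edac_line, HEADER and FIELD are hypothetical, and the regular expressions assume the line layout of the sample message quoted in this pull request; other EDAC drivers or kernel versions may format the line differently.

```python
import re

# Sample line copied from the test log quoted in this pull request.
SAMPLE = (
    "[ 148.514145] EDAC MC3: 1 CE memory read error on "
    "CPU_SrcID#0_MC#3_Chan#0_DIMM#0 (channel:0 slot:0 page:0x11cc51 "
    "offset:0x480 grain:32 syndrome:0x0 - err_code:0x0080:0x0090 "
    "SystemAddress:0x11cc51480 ProcessorSocketId:0x0 "
    "MemoryControllerId:0x3 ChannelAddress:0x1398a380 ChannelId:0x0 "
    "RankAddress:0x4e62880 PhysicalRankId:0x1 DimmSlotId:0x0 "
    "DimmRankId:0x1 Row:0x4e4 Column:0x100 Bank:0x3 BankGroup:0x7 "
    "ChipSelect:0x1)"
)

# Assumed layout: "EDAC MC<n>: <count> <CE|UE> <message> on <label> (<details>)"
HEADER = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) "
    r"(?P<msg>.+?) on (?P<label>\S+) \((?P<details>.*)\)"
)
# Each detail is a "key:value" token; values are kept as strings.
FIELD = re.compile(r"(\w+):(\S+)")


def parse_edac_line(line):
    """Return a dict describing one EDAC error line, or None if no match."""
    m = HEADER.search(line)
    if m is None:
        return None
    return {
        "mc": int(m["mc"]),
        "count": int(m["count"]),
        "kind": m["kind"],
        "message": m["msg"],
        "label": m["label"],
        "fields": dict(FIELD.findall(m["details"])),
    }


if __name__ == "__main__":
    record = parse_edac_line(SAMPLE)
    print(record["kind"], record["label"])
    for key, value in record["fields"].items():
        print(f"  {key} = {value}")
```

For the sample line, the parsed page and offset fields are consistent with SystemAddress if a 4 KiB page size is assumed: (0x11cc51 << 12) | 0x480 equals 0x11cc51480. A check like this is a quick sanity test that the decoded address fields belong to the same error.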