Commit 7c3dc440 authored by Linus Torvalds
Pull Compute Express Link (CXL) updates from Dan Williams:
 "To date Linux has been dependent on platform-firmware to map CXL RAM
  regions and handle events / errors from devices. With this update we
  can now parse / update the CXL memory layout, and report events /
  errors from devices. This is a precursor for the CXL subsystem to
  handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that
  for DDR-attached-DRAM is handled by the EDAC driver where it maps
  system physical address events to a field-replaceable-unit (FRU /
  endpoint device). In general, CXL has the potential to standardize
  what has historically been a pile of memory-controller-specific error
  handling logic.

  Another change of note is the default policy for handling RAM-backed
  device-dax instances. Previously the default access mode was "device",
  where users mmap(2) a device special file to access memory. The new default is
  "kmem" where the address range is assigned to the core-mm via
  add_memory_driver_managed(). This saves typical users from wondering
  why their platform memory is not visible via free(1) and stuck behind
  a device-file. At the same time it allows expert users to deploy
  policy to, for example, get dedicated access to high performance
  memory, or hide low performance memory from general purpose kernel
  allocations. This affects not only CXL, but also systems with
  high-bandwidth-memory that platform-firmware tags with the
  EFI_MEMORY_SP (special purpose) designation.

  Summary:

   - CXL RAM region enumeration: instantiate 'struct cxl_region' objects
     for platform firmware created memory regions

   - CXL RAM region provisioning: complement the existing PMEM region
     creation support with RAM region support

   - "Soft Reservation" policy change: Online (memory hot-add)
     soft-reserved memory (EFI_MEMORY_SP) by default, but still allow
     for setting aside such memory for dedicated access via device-dax.

   - CXL Events and Interrupts: Take over CXL event handling from
     platform-firmware (ACPI calls this CXL Memory Error Reporting) and
     export CXL Events via Linux Trace Events.

   - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
     subsystem interrogate the result of CXL _OSC negotiation.

   - Emulate CXL DVSEC Range Registers as "decoders": Allow for
     first-generation devices that pre-date the definition of the CXL
     HDM Decoder Capability to translate the CXL DVSEC Range Registers
     into 'struct cxl_decoder' objects.

   - Set timestamp: Per spec, set the device timestamp in case of
     hotplug, or if platform-firmware failed to set it.

   - General fixups: linux-next build issues, non-urgent fixes for
     pre-production hardware, unit test fixes, spelling and debug
     message improvements"

* tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits)
  dax/kmem: Fix leak of memory-hotplug resources
  cxl/mem: Add kdoc param for event log driver state
  cxl/trace: Add serial number to trace points
  cxl/trace: Add host output to trace points
  cxl/trace: Standardize device information output
  cxl/pci: Remove locked check for dvsec_range_allowed()
  cxl/hdm: Add emulation when HDM decoders are not committed
  cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders
  cxl/hdm: Emulate HDM decoder from DVSEC range registers
  cxl/pci: Refactor cxl_hdm_decode_init()
  cxl/port: Export cxl_dvsec_rr_decode() to cxl_port
  cxl/pci: Break out range register decoding from cxl_hdm_decode_init()
  cxl: add RAS status unmasking for CXL
  cxl: remove unnecessary calling of pci_enable_pcie_error_reporting()
  dax/hmem: build hmem device support as module if possible
  dax: cxl: add CXL_REGION dependency
  cxl: avoid returning uninitialized error code
  cxl/pmem: Fix nvdimm registration races
  cxl/mem: Fix UAPI command comment
  cxl/uapi: Tag commands from cxl_query_cmd()
  ...
parents d8e47318 e686c325
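
The "device" access mode described in the pull message above means an application maps the device-dax character device itself. The following is a minimal C sketch of that access pattern, not code from this merge: the helper name `map_devdax`, the instance path `/dev/dax0.0`, and the 2 MiB length in the usage note are illustrative assumptions. Device-dax mappings generally need to be MAP_SHARED and sized to the device's alignment.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes of a device-dax character device, e.g. "/dev/dax0.0"
 * (a hypothetical instance name). Returns NULL on any failure.
 */
void *map_devdax(const char *path, size_t len)
{
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return NULL;

	/* A shared mapping is requested; offset 0 is the start of the range. */
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	close(fd);	/* the mapping remains valid after the descriptor is closed */

	return addr == MAP_FAILED ? NULL : addr;
}
```

A caller might use `void *p = map_devdax("/dev/dax0.0", 2UL << 20);` and later release the range with `munmap(p, 2UL << 20)`. With the new "kmem" default this step is not needed, because the range is handed to the core-mm and shows up in free(1).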
+53 −26
@@ -90,6 +90,21 @@ Description:
		capability.


+What:		/sys/bus/cxl/devices/{port,endpoint}X/parent_dport
+Date:		January, 2023
+KernelVersion:	v6.3
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) CXL port objects are instantiated for each upstream port in
+		a CXL/PCIe switch, and for each endpoint to map the
+		corresponding memory device into the CXL port hierarchy. When a
+		descendant CXL port (switch or endpoint) is enumerated it is
+		useful to know which 'dport' object in the parent CXL port
+		routes to this descendant. The 'parent_dport' symlink points to
+		the device representing the downstream port of a CXL switch that
+		routes to {port,endpoint}X.
+
+
What:		/sys/bus/cxl/devices/portX/dportY
Date:		June, 2021
KernelVersion:	v5.14
@@ -183,7 +198,7 @@ Description:

What:		/sys/bus/cxl/devices/endpointX/CDAT
Date:		July, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) If this sysfs entry is not present no DOE mailbox was
@@ -194,7 +209,7 @@ Description:

What:		/sys/bus/cxl/devices/decoderX.Y/mode
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
@@ -214,7 +229,7 @@ Description:

What:		/sys/bus/cxl/devices/decoderX.Y/dpa_resource
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",
@@ -225,7 +240,7 @@ Description:

What:		/sys/bus/cxl/devices/decoderX.Y/dpa_size
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
@@ -245,7 +260,7 @@ Description:

What:		/sys/bus/cxl/devices/decoderX.Y/interleave_ways
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) The number of targets across which this decoder's host
@@ -260,7 +275,7 @@ Description:

What:		/sys/bus/cxl/devices/decoderX.Y/interleave_granularity
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) The number of consecutive bytes of host physical address
@@ -270,25 +285,25 @@ Description:
		interleave_granularity).


-What:		/sys/bus/cxl/devices/decoderX.Y/create_pmem_region
-Date:		May, 2022
-KernelVersion:	v5.20
+What:		/sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram}_region
+Date:		May, 2022, January, 2023
+KernelVersion:	v6.0 (pmem), v6.3 (ram)
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Write a string in the form 'regionZ' to start the process
-		of defining a new persistent memory region (interleave-set)
-		within the decode range bounded by root decoder 'decoderX.Y'.
-		The value written must match the current value returned from
-		reading this attribute. An atomic compare exchange operation is
-		done on write to assign the requested id to a region and
-		allocate the region-id for the next creation attempt. EBUSY is
-		returned if the region name written does not match the current
-		cached value.
+		of defining a new persistent, or volatile memory region
+		(interleave-set) within the decode range bounded by root decoder
+		'decoderX.Y'. The value written must match the current value
+		returned from reading this attribute. An atomic compare exchange
+		operation is done on write to assign the requested id to a
+		region and allocate the region-id for the next creation attempt.
+		EBUSY is returned if the region name written does not match the
+		current cached value.


What:		/sys/bus/cxl/devices/decoderX.Y/delete_region
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) Write a string in the form 'regionZ' to delete that region,
@@ -297,17 +312,18 @@ Description:

What:		/sys/bus/cxl/devices/regionZ/uuid
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Write a unique identifier for the region. This field must
		be set for persistent regions and it must not conflict with the
-		UUID of another region.
+		UUID of another region. For volatile ram regions this
+		attribute is a read-only empty string.


What:		/sys/bus/cxl/devices/regionZ/interleave_granularity
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Set the number of consecutive bytes each device in the
@@ -318,7 +334,7 @@ Description:

What:		/sys/bus/cxl/devices/regionZ/interleave_ways
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Configures the number of devices participating in the
@@ -328,7 +344,7 @@ Description:

What:		/sys/bus/cxl/devices/regionZ/size
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) System physical address space to be consumed by the region.
@@ -343,9 +359,20 @@ Description:
		results in the same address being allocated.


+What:		/sys/bus/cxl/devices/regionZ/mode
+Date:		January, 2023
+KernelVersion:	v6.3
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) The mode of a region is established at region creation time
+		and dictates the mode of the endpoint decoder that comprise the
+		region. For more details on the possible modes see
+		/sys/bus/cxl/devices/decoderX.Y/mode
+
+
What:		/sys/bus/cxl/devices/regionZ/resource
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) A region is a contiguous partition of a CXL root decoder
@@ -357,7 +384,7 @@ Description:

What:		/sys/bus/cxl/devices/regionZ/target[0..N]
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Write an endpoint decoder object name to 'targetX' where X
@@ -376,7 +403,7 @@ Description:

What:		/sys/bus/cxl/devices/regionZ/commit
Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Write a boolean 'true' string value to this attribute to
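
The create_{pmem,ram}_region entry in the ABI documentation diff above describes a read-then-write protocol guarded by an atomic compare exchange: the write succeeds only if the name written still matches the cached value, otherwise EBUSY is returned. The following is a small userspace model of that rule in C11, not the kernel's implementation. The type `region_id_cache`, the function `claim_region_id`, and the `fresh_id` parameter (standing in for whatever id would be allocated for the next attempt) are hypothetical names introduced for illustration.

```c
#include <errno.h>
#include <stdatomic.h>

/* Userspace model of the cached "next region id" behind the attribute. */
struct region_id_cache {
	atomic_int next_id;
};

/*
 * Model of writing "region<requested>" to create_{pmem,ram}_region.
 * If `requested` still matches the cached id, the cache advances to
 * `fresh_id` and 0 is returned; a stale `requested` yields -EBUSY and
 * leaves the cache untouched.
 */
int claim_region_id(struct region_id_cache *cache, int requested, int fresh_id)
{
	int expected = requested;

	if (!atomic_compare_exchange_strong(&cache->next_id, &expected, fresh_id))
		return -EBUSY;

	return 0;
}
```

In this model two writers that both read "region0" cannot both succeed: the first claim advances the cached id, so the second sees a mismatch and has to re-read the attribute before retrying.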
+1 −0
@@ -5912,6 +5912,7 @@ M: Dan Williams <dan.j.williams@intel.com>
M:	Vishal Verma <vishal.l.verma@intel.com>
M:	Dave Jiang <dave.jiang@intel.com>
L:	nvdimm@lists.linux.dev
+L:	linux-cxl@vger.kernel.org
S:	Supported
F:	drivers/dax/
+1 −1
@@ -71,7 +71,7 @@ obj-$(CONFIG_FB_INTEL) += video/fbdev/intelfb/
obj-$(CONFIG_PARPORT)		+= parport/
obj-y				+= base/ block/ misc/ mfd/ nfc/
obj-$(CONFIG_LIBNVDIMM)		+= nvdimm/
-obj-$(CONFIG_DAX)		+= dax/
+obj-y				+= dax/
obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf/
obj-$(CONFIG_NUBUS)		+= nubus/
obj-y				+= cxl/
+2 −2
@@ -718,7 +718,7 @@ static void hmat_register_target_devices(struct memory_target *target)
	for (res = target->memregions.child; res; res = res->sibling) {
		int target_nid = pxm_to_node(target->memory_pxm);

-		hmem_register_device(target_nid, res);
+		hmem_register_resource(target_nid, res);
	}
}

@@ -869,4 +869,4 @@ static __init int hmat_init(void)
	acpi_put_table(tbl);
	return 0;
}
-device_initcall(hmat_init);
+subsys_initcall(hmat_init);
+3 −0
@@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
		host_bridge->native_dpc = 0;

+	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
+		host_bridge->native_cxl_error = 0;
+
	/*
	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
	 * exists and returns 0, we must preserve any PCI resource