Commit c2356983 authored by Linus Torvalds
Pull cxl updates from Dan Williams:
 "Compute Express Link (CXL) updates for 6.0:

   - Introduce a 'struct cxl_region' object with support for
     provisioning and assembling persistent memory regions.

   - Introduce alloc_free_mem_region() to accompany the existing
     request_free_mem_region() as a method to allocate physical memory
     capacity out of an existing resource.

   - Export insert_resource_expand_to_fit() for the CXL subsystem to
     late-publish CXL platform windows in iomem_resource.

   - Add a polled mode PCI DOE (Data Object Exchange) driver service and
     use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute
     Table)"

* tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (74 commits)
  cxl/hdm: Fix skip allocations vs multiple pmem allocations
  cxl/region: Disallow region granularity != window granularity
  cxl/region: Fix x1 interleave to greater than x1 interleave routing
  cxl/region: Move HPA setup to cxl_region_attach()
  cxl/region: Fix decoder interleave programming
  Documentation: cxl: remove dangling kernel-doc reference
  cxl/region: describe targets and nr_targets members of cxl_region_params
  cxl/regions: add padding for cxl_rr_ep_add nested lists
  cxl/region: Fix IS_ERR() vs NULL check
  cxl/region: Fix region reference target accounting
  cxl/region: Fix region commit uninitialized variable warning
  cxl/region: Fix port setup uninitialized variable warnings
  cxl/region: Stop initializing interleave granularity
  cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime
  cxl/acpi: Minimize granularity for x1 interleaves
  cxl/region: Delete 'region' attribute from root decoders
  cxl/acpi: Autoload driver for 'cxl_acpi' test devices
  cxl/region: decrement ->nr_targets on error in cxl_region_attach()
  cxl/region: prevent underflow in ways_to_cxl()
  cxl/region: uninitialized variable in alloc_hpa()
  ...
parents 5e2e7383 1cd8a253
+1 −0
@@ -516,6 +516,7 @@ ForEachMacros:
  - 'of_property_for_each_string'
  - 'of_property_for_each_u32'
  - 'pci_bus_for_each_resource'
  - 'pci_doe_for_each_off'
  - 'pcl_for_each_chunk'
  - 'pcl_for_each_segment'
  - 'pcm_for_each_format'
+265 −40
@@ -7,6 +7,7 @@ Description:
		all descendant memdevs for unbind. Writing '1' to this attribute
		flushes that work.


What:		/sys/bus/cxl/devices/memX/firmware_version
Date:		December, 2020
KernelVersion:	v5.12
@@ -16,6 +17,7 @@ Description:
		Memory Device Output Payload in the CXL-2.0
		specification.


What:		/sys/bus/cxl/devices/memX/ram/size
Date:		December, 2020
KernelVersion:	v5.12
@@ -25,6 +27,7 @@ Description:
		identically named field in the Identify Memory Device Output
		Payload in the CXL-2.0 specification.


What:		/sys/bus/cxl/devices/memX/pmem/size
Date:		December, 2020
KernelVersion:	v5.12
@@ -34,6 +37,7 @@ Description:
		identically named field in the Identify Memory Device Output
		Payload in the CXL-2.0 specification.


What:		/sys/bus/cxl/devices/memX/serial
Date:		January, 2022
KernelVersion:	v5.18
@@ -43,6 +47,7 @@ Description:
		capability. Mandatory for CXL devices, see CXL 2.0 8.1.12.2
		Memory Device PCIe Capabilities and Extended Capabilities.


What:		/sys/bus/cxl/devices/memX/numa_node
Date:		January, 2022
KernelVersion:	v5.18
@@ -52,114 +57,334 @@ Description:
		host PCI device for this memory device, emit the CPU node
		affinity for this device.


What:		/sys/bus/cxl/devices/*/devtype
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		CXL device objects export the devtype attribute which mirrors
		the same value communicated in the DEVTYPE environment variable
		for uevents for devices on the "cxl" bus.
		(RO) CXL device objects export the devtype attribute which
		mirrors the same value communicated in the DEVTYPE environment
		variable for uevents for devices on the "cxl" bus.


What:		/sys/bus/cxl/devices/*/modalias
Date:		December, 2021
KernelVersion:	v5.18
Contact:	linux-cxl@vger.kernel.org
Description:
		CXL device objects export the modalias attribute which mirrors
		the same value communicated in the MODALIAS environment variable
		for uevents for devices on the "cxl" bus.
		(RO) CXL device objects export the modalias attribute which
		mirrors the same value communicated in the MODALIAS environment
		variable for uevents for devices on the "cxl" bus.


What:		/sys/bus/cxl/devices/portX/uport
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		CXL port objects are enumerated from either a platform firmware
		device (ACPI0017 and ACPI0016) or PCIe switch upstream port with
		CXL component registers. The 'uport' symlink connects the CXL
		portX object to the device that published the CXL port
		(RO) CXL port objects are enumerated from either a platform
		firmware device (ACPI0017 and ACPI0016) or PCIe switch upstream
		port with CXL component registers. The 'uport' symlink connects
		the CXL portX object to the device that published the CXL port
		capability.


What:		/sys/bus/cxl/devices/portX/dportY
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		CXL port objects are enumerated from either a platform firmware
		device (ACPI0017 and ACPI0016) or PCIe switch upstream port with
		CXL component registers. The 'dportY' symlink identifies one or
		more downstream ports that the upstream port may target in its
		decode of CXL memory resources.  The 'Y' integer reflects the
		hardware port unique-id used in the hardware decoder target
		list.
		(RO) CXL port objects are enumerated from either a platform
		firmware device (ACPI0017 and ACPI0016) or PCIe switch upstream
		port with CXL component registers. The 'dportY' symlink
		identifies one or more downstream ports that the upstream port
		may target in its decode of CXL memory resources.  The 'Y'
		integer reflects the hardware port unique-id used in the
		hardware decoder target list.


What:		/sys/bus/cxl/devices/decoderX.Y
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		CXL decoder objects are enumerated from either a platform
		(RO) CXL decoder objects are enumerated from either a platform
		firmware description, or a CXL HDM decoder register set in a
		PCIe device (see CXL 2.0 section 8.2.5.12 CXL HDM Decoder
		Capability Structure). The 'X' in decoderX.Y represents the
		cxl_port container of this decoder, and 'Y' represents the
		instance id of a given decoder resource.


What:		/sys/bus/cxl/devices/decoderX.Y/{start,size}
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		The 'start' and 'size' attributes together convey the physical
		address base and number of bytes mapped in the decoder's decode
		window. For decoders of devtype "cxl_decoder_root" the address
		range is fixed. For decoders of devtype "cxl_decoder_switch" the
		address is bounded by the decode range of the cxl_port ancestor
		of the decoder's cxl_port, and dynamically updates based on the
		active memory regions in that address space.
		(RO) The 'start' and 'size' attributes together convey the
		physical address base and number of bytes mapped in the
		decoder's decode window. For decoders of devtype
		"cxl_decoder_root" the address range is fixed. For decoders of
		devtype "cxl_decoder_switch" the address is bounded by the
		decode range of the cxl_port ancestor of the decoder's cxl_port,
		and dynamically updates based on the active memory regions in
		that address space.


What:		/sys/bus/cxl/devices/decoderX.Y/locked
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		CXL HDM decoders have the capability to lock the configuration
		until the next device reset. For decoders of devtype
		"cxl_decoder_root" there is no standard facility to unlock them.
		For decoders of devtype "cxl_decoder_switch" a secondary bus
		reset, of the PCIe bridge that provides the bus for this
		decoders uport, unlocks / resets the decoder.
		(RO) CXL HDM decoders have the capability to lock the
		configuration until the next device reset. For decoders of
		devtype "cxl_decoder_root" there is no standard facility to
		unlock them.  For decoders of devtype "cxl_decoder_switch" a
		secondary bus reset, of the PCIe bridge that provides the bus
		for this decoders uport, unlocks / resets the decoder.


What:		/sys/bus/cxl/devices/decoderX.Y/target_list
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		Display a comma separated list of the current decoder target
		configuration. The list is ordered by the current configured
		interleave order of the decoder's dport instances. Each entry in
		the list is a dport id.
		(RO) Display a comma separated list of the current decoder
		target configuration. The list is ordered by the current
		configured interleave order of the decoder's dport instances.
		Each entry in the list is a dport id.


What:		/sys/bus/cxl/devices/decoderX.Y/cap_{pmem,ram,type2,type3}
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		When a CXL decoder is of devtype "cxl_decoder_root", it
		(RO) When a CXL decoder is of devtype "cxl_decoder_root", it
		represents a fixed memory window identified by platform
		firmware. A fixed window may only support a subset of memory
		types. The 'cap_*' attributes indicate whether persistent
		memory, volatile memory, accelerator memory, and / or expander
		memory may be mapped behind this decoder's memory window.


What:		/sys/bus/cxl/devices/decoderX.Y/target_type
Date:		June, 2021
KernelVersion:	v5.14
Contact:	linux-cxl@vger.kernel.org
Description:
		When a CXL decoder is of devtype "cxl_decoder_switch", it can
		optionally decode either accelerator memory (type-2) or expander
		memory (type-3). The 'target_type' attribute indicates the
		current setting which may dynamically change based on what
		(RO) When a CXL decoder is of devtype "cxl_decoder_switch", it
		can optionally decode either accelerator memory (type-2) or
		expander memory (type-3). The 'target_type' attribute indicates
		the current setting which may dynamically change based on what
		memory regions are activated in this decode hierarchy.


What:		/sys/bus/cxl/devices/endpointX/CDAT
Date:		July, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) If this sysfs entry is not present no DOE mailbox was
		found to support CDAT data.  If it is present and the length of
		the data is 0 reading the CDAT data failed.  Otherwise the CDAT
		data is reported.


What:		/sys/bus/cxl/devices/decoderX.Y/mode
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
		translates from a host physical address range, to a device local
		address range. Device-local address ranges are further split
		into a 'ram' (volatile memory) range and 'pmem' (persistent
		memory) range. The 'mode' attribute emits one of 'ram', 'pmem',
		'mixed', or 'none'. The 'mixed' indication is for error cases
		when a decoder straddles the volatile/persistent partition
		boundary, and 'none' indicates the decoder is not actively
		decoding, or no DPA allocation policy has been set.

		'mode' can be written, when the decoder is in the 'disabled'
		state, with either 'ram' or 'pmem' to set the boundaries for the
		next allocation.


What:		/sys/bus/cxl/devices/decoderX.Y/dpa_resource
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",
		and its 'dpa_size' attribute is non-zero, this attribute
		indicates the device physical address (DPA) base address of the
		allocation.


What:		/sys/bus/cxl/devices/decoderX.Y/dpa_size
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
		translates from a host physical address range, to a device local
		address range. The range, base address plus length in bytes, of
		DPA allocated to this decoder is conveyed in these 2 attributes.
		Allocations can be mutated as long as the decoder is in the
		disabled state. A write to 'dpa_size' releases the previous DPA
		allocation and then attempts to allocate from the free capacity
		in the device partition referred to by 'decoderX.Y/mode'.
		Allocate and free requests can only be performed on the highest
		instance number disabled decoder with non-zero size. I.e.
		allocations are enforced to occur in increasing 'decoderX.Y/id'
		order and frees are enforced to occur in decreasing
		'decoderX.Y/id' order.


What:		/sys/bus/cxl/devices/decoderX.Y/interleave_ways
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) The number of targets across which this decoder's host
		physical address (HPA) memory range is interleaved. The device
		maps every Nth block of HPA (of size ==
		'interleave_granularity') to consecutive DPA addresses. The
		decoder's position in the interleave is determined by the
		device's (endpoint or switch) switch ancestry. For root
		decoders their interleave is specified by platform firmware and
		they only specify a downstream target order for host bridges.


What:		/sys/bus/cxl/devices/decoderX.Y/interleave_granularity
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) The number of consecutive bytes of host physical address
		space this decoder claims at address N before the decode rotates
		to the next target in the interleave at address N +
		interleave_granularity (assuming N is aligned to
		interleave_granularity).
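
Reviewer note: the 'interleave_ways' and 'interleave_granularity' descriptions above can be illustrated with a small worked example. This is a minimal sketch, assuming a plain modulo interleave and an offset measured from the start of the decoder's HPA window; the function names are hypothetical, and real HDM decoders select address bits and have additional rules for non-power-of-2 ways that this does not model.

```python
def interleave_position(hpa_offset: int, ways: int, granularity: int) -> int:
    """Return which target (0..ways-1) claims the byte at hpa_offset.

    Assumption: simple modulo rotation, one granule per target per pass.
    """
    if ways < 1 or granularity < 1:
        raise ValueError("ways and granularity must be positive")
    return (hpa_offset // granularity) % ways


def device_offset(hpa_offset: int, ways: int, granularity: int) -> int:
    """Return the offset into the claiming device's own contribution.

    Each full pass over all targets consumes ways * granularity bytes of
    HPA and advances every device by one granule.
    """
    if ways < 1 or granularity < 1:
        raise ValueError("ways and granularity must be positive")
    passes = hpa_offset // (granularity * ways)
    return passes * granularity + hpa_offset % granularity


if __name__ == "__main__":
    # x2 interleave at 256-byte granularity: granules alternate 0, 1, 0, 1.
    for off in (0, 256, 512, 773):
        print(off, interleave_position(off, 2, 256), device_offset(off, 2, 256))
```

With two ways, each device ends up providing half of the window, which matches the "1/interleave_ways of storage" statement in the region attributes.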


What:		/sys/bus/cxl/devices/decoderX.Y/create_pmem_region
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Write a string in the form 'regionZ' to start the process
		of defining a new persistent memory region (interleave-set)
		within the decode range bounded by root decoder 'decoderX.Y'.
		The value written must match the current value returned from
		reading this attribute. An atomic compare exchange operation is
		done on write to assign the requested id to a region and
		allocate the region-id for the next creation attempt. EBUSY is
		returned if the region name written does not match the current
		cached value.


What:		/sys/bus/cxl/devices/decoderX.Y/delete_region
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) Write a string in the form 'regionZ' to delete that region,
		provided it is currently idle / not bound to a driver.


What:		/sys/bus/cxl/devices/regionZ/uuid
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Write a unique identifier for the region. This field must
		be set for persistent regions and it must not conflict with the
		UUID of another region.


What:		/sys/bus/cxl/devices/regionZ/interleave_granularity
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Set the number of consecutive bytes each device in the
		interleave set will claim. The possible interleave granularity
		values are determined by the CXL spec and the participating
		devices.


What:		/sys/bus/cxl/devices/regionZ/interleave_ways
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Configures the number of devices participating in the
		region is set by writing this value. Each device will provide
		1/interleave_ways of storage for the region.


What:		/sys/bus/cxl/devices/regionZ/size
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) System physical address space to be consumed by the region.
		When written trigger the driver to allocate space out of the
		parent root decoder's address space. When read the size of the
		address space is reported and should match the span of the
		region's resource attribute. Size shall be set after the
		interleave configuration parameters. Once set it cannot be
		changed, only freed by writing 0. The kernel makes no guarantees
		that data is maintained over an address space freeing event, and
		there is no guarantee that a free followed by an allocate
		results in the same address being allocated.


What:		/sys/bus/cxl/devices/regionZ/resource
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) A region is a contiguous partition of a CXL root decoder
		address space. Region capacity is allocated by writing to the
		size attribute, the resulting physical address space determined
		by the driver is reflected here. It is therefore not useful to
		read this before writing a value to the size attribute.


What:		/sys/bus/cxl/devices/regionZ/target[0..N]
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Write an endpoint decoder object name to 'targetX' where X
		is the intended position of the endpoint device in the region
		interleave and N is the 'interleave_ways' setting for the
		region. ENXIO is returned if the write results in an impossible
		to map decode scenario, like the endpoint is unreachable at that
		position relative to the root decoder interleave. EBUSY is
		returned if the position in the region is already occupied, or
		if the region is not in a state to accept interleave
		configuration changes. EINVAL is returned if the object name is
		not an endpoint decoder. Once all positions have been
		successfully written a final validation for decode conflicts is
		performed before activating the region.


What:		/sys/bus/cxl/devices/regionZ/commit
Date:		May, 2022
KernelVersion:	v5.20
Contact:	linux-cxl@vger.kernel.org
Description:
		(RW) Write a boolean 'true' string value to this attribute to
		trigger the region to transition from the software programmed
		state to the actively decoding in hardware state. The commit
		operation in addition to validating that the region is in proper
		configured state, validates that the decoders are being
		committed in spec mandated order (last committed decoder id +
		1), and checks that the hardware accepts the commit request.
		Reading this value indicates whether the region is committed or
		not.
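
Reviewer note: taken together, the regionZ attributes above imply an ordering for provisioning a persistent memory region. The following is a minimal sketch of that ordering, not the reference tooling. The function name and its parameters are hypothetical, the `devices` argument stands in for /sys/bus/cxl/devices, and it assumes each endpoint decoder already has its 'mode' and 'dpa_size' set. Writing "1" is assumed to be an accepted boolean 'true' string for 'commit'.

```python
from pathlib import Path
from typing import Sequence


def provision_pmem_region(devices: Path, root_decoder: str,
                          endpoint_decoders: Sequence[str],
                          uuid: str, granularity: int, size: int) -> str:
    """Write the region attributes in the order the ABI text describes."""
    create = devices / root_decoder / "create_pmem_region"
    # Reading returns the next region name; writing the same name back
    # claims it. A mismatch is documented to fail with EBUSY, so a real
    # script would need to re-read and retry.
    name = create.read_text().strip()
    create.write_text(name)

    region = devices / name
    (region / "uuid").write_text(uuid)
    (region / "interleave_granularity").write_text(str(granularity))
    (region / "interleave_ways").write_text(str(len(endpoint_decoders)))
    # Size is set only after the interleave parameters.
    (region / "size").write_text(str(size))
    # One endpoint decoder per interleave position.
    for position, decoder in enumerate(endpoint_decoders):
        (region / f"target{position}").write_text(decoder)
    # Transition from software-programmed to actively decoding.
    (region / "commit").write_text("1")
    return name
```

A call such as `provision_pmem_region(Path("/sys/bus/cxl/devices"), "decoder0.0", ["decoder2.0", "decoder3.0"], some_uuid, 256, 1 << 28)` would attempt a x2 region; the decoder names here are placeholders, and any step may raise OSError with the errno values listed in the attribute descriptions.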
+8 −0
@@ -362,6 +362,14 @@ CXL Core
.. kernel-doc:: drivers/cxl/core/mbox.c
   :doc: cxl mbox

CXL Regions
-----------
.. kernel-doc:: drivers/cxl/core/region.c
   :doc: cxl core region

.. kernel-doc:: drivers/cxl/core/region.c
   :identifiers:

External Interfaces
===================

+1 −0
@@ -55,6 +55,7 @@ int memory_add_physaddr_to_nid(u64 start)
{
	return hot_add_scn_to_nid(start);
}
EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
#endif

int __weak create_section_mapping(unsigned long start, unsigned long end,
+9 −0
@@ -2,6 +2,7 @@
menuconfig CXL_BUS
	tristate "CXL (Compute Express Link) Devices Support"
	depends on PCI
	select PCI_DOE
	help
	  CXL is a bus that is electrically compatible with PCI Express, but
	  layers three protocols on that signalling (CXL.io, CXL.cache, and
@@ -102,4 +103,12 @@ config CXL_SUSPEND
	def_bool y
	depends on SUSPEND && CXL_MEM

config CXL_REGION
	bool
	default CXL_BUS
	# For MAX_PHYSMEM_BITS
	depends on SPARSEMEM
	select MEMREGION
	select GET_FREE_REGION

endif