Merge tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm (82cce5f4) · Commits · EulixOS / Software / Kernel

Documentation/gpu/rfc/i915_gem_lmem.rst

+0 −109

Original line number	Diff line number	Diff line
		@@ -18,114 +18,5 @@ real, with all the uAPI bits is:
		* Route shmem backend over to TTM SYSTEM for discrete
		* TTM purgeable object support
		* Move i915 buddy allocator over to TTM
		* MMAP ioctl mode(see `I915 MMAP`_)
		* SET/GET ioctl caching(see `I915 SET/GET CACHING`_)
		* Send RFC(with mesa-dev on cc) for final sign off on the uAPI
		* Add pciid for DG1 and turn on uAPI for real

		New object placement and region query uAPI
		==========================================
		Starting from DG1 we need to give userspace the ability to allocate buffers from
		device local-memory. Currently the driver supports gem_create, which can place
		buffers in system memory via shmem, and the usual assortment of other
		interfaces, like dumb buffers and userptr.

		To support this new capability, while also providing a uAPI which will work
		beyond just DG1, we propose to offer three new bits of uAPI:

		DRM_I915_QUERY_MEMORY_REGIONS
		-----------------------------
		New query ID which allows userspace to discover the list of supported memory
		regions(like system-memory and local-memory) for a given device. We identify
		each region with a class and instance pair, which should be unique. The class
		here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
		like DG1.

		Side note: The class/instance design is borrowed from our existing engine uAPI,
		where we describe every physical engine in terms of its class, and the
		particular instance, since we can have more than one per class.

		In the future we also want to expose more information which can further
		describe the capabilities of a region.

		.. kernel-doc:: include/uapi/drm/i915_drm.h
		:functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions

		GEM_CREATE_EXT
		--------------
		New ioctl which is basically just gem_create but now allows userspace to provide
		a chain of possible extensions. Note that if we don't provide any extensions and
		set flags=0 then we get the exact same behaviour as gem_create.

		Side note: We also need to support PXP[1] in the near future, which is also
		applicable to integrated platforms, and adds its own gem_create_ext extension,
		which basically lets userspace mark a buffer as "protected".

		.. kernel-doc:: include/uapi/drm/i915_drm.h
		:functions: drm_i915_gem_create_ext

		I915_GEM_CREATE_EXT_MEMORY_REGIONS
		----------------------------------
		Implemented as an extension for gem_create_ext, we would now allow userspace to
		optionally provide an immutable list of preferred placements at creation time,
		in priority order, for a given buffer object. For the placements we expect
		them each to use the class/instance encoding, as per the output of the regions
		query. Having the list in priority order will be useful in the future when
		placing an object, say during eviction.

		.. kernel-doc:: include/uapi/drm/i915_drm.h
		:functions: drm_i915_gem_create_ext_memory_regions

		One fair criticism here is that this seems a little over-engineered[2]. If we
		just consider DG1 then yes, a simple gem_create.flags or something is totally
		all that's needed to tell the kernel to allocate the buffer in local-memory or
		whatever. However looking to the future we need uAPI which can also support
		upcoming Xe HP multi-tile architecture in a sane way, where there can be
		multiple local-memory instances for a given device, and so using both class and
		instance in our uAPI to describe regions is desirable, although specifically
		for DG1 it's uninteresting, since we only have a single local-memory instance.
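To make the class/instance encoding and the priority-ordered placement list concrete, here is a minimal sketch. The enum, struct, and function names are hypothetical stand-ins that only mirror the shape the RFC describes; they are not the definitions from include/uapi/drm/i915_drm.h, and the "local-memory first, system memory as fallback" policy is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the real i915 uAPI definitions. */
enum example_memory_class {
	EXAMPLE_MEMORY_CLASS_SYSTEM = 0,
	EXAMPLE_MEMORY_CLASS_DEVICE = 1,
};

struct example_memory_region {
	uint16_t memory_class;    /* SYSTEM or DEVICE */
	uint16_t memory_instance; /* zero on a DG1-like device */
};

/*
 * Fill @out with placements in priority order: every local-memory
 * instance first, then system memory as the last-resort fallback.
 * Returns the number of entries written, or 0 if @out is NULL or
 * too small to hold all of them.
 */
size_t example_build_placements(struct example_memory_region *out,
				size_t capacity,
				uint16_t num_lmem_instances)
{
	size_t n = 0;
	uint16_t i;

	if (out == NULL || capacity < (size_t)num_lmem_instances + 1)
		return 0;

	for (i = 0; i < num_lmem_instances; i++) {
		out[n].memory_class = EXAMPLE_MEMORY_CLASS_DEVICE;
		out[n].memory_instance = i;
		n++;
	}

	out[n].memory_class = EXAMPLE_MEMORY_CLASS_SYSTEM;
	out[n].memory_instance = 0;
	n++;

	return n;
}
```

With a single local-memory instance this yields two entries, (DEVICE, 0) followed by (SYSTEM, 0); on a multi-tile device the same loop would simply emit one DEVICE entry per instance.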

		Existing uAPI issues
		====================
		Some potential issues we still need to resolve.

		I915 MMAP
		---------
		In i915 there are multiple ways to MMAP a GEM object, including mapping the same
		object using different mapping types(WC vs WB), i.e. multiple active mmaps per
		object. TTM expects one MMAP at most for the lifetime of the object. If it
		turns out that we have to backpedal here, there might be some potential
		userspace fallout.

		I915 SET/GET CACHING
		--------------------
		In i915 we have the set/get_caching ioctl. TTM doesn't let us change this, but
		DG1 doesn't support non-snooped pcie transactions, so we can just always
		allocate as WB for smem-only buffers. If/when our hw gains support for
		non-snooped pcie transactions then we must fix this mode at allocation time as
		a new GEM extension.

		This is related to the mmap problem, because in general (meaning, when we're
		not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
		allocation mode.

		One possible idea is to let the kernel pick the mmap mode for userspace from the
		following table:

		smem-only: WB. Userspace does not need to call clflush.

		smem+lmem: We only ever allow a single mode, so simply allocate this as uncached
		memory, and always give userspace a WC mapping. GPU still does snooped access
		here(assuming we can't turn it off like on DG1), which is a bit inefficient.

		lmem only: always WC

		This means on discrete you only get a single mmap mode, all others must be
		rejected. That's probably going to be a new default mode or something like
		that.
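The proposed table can be read as a small decision function. The sketch below is only an illustration of that reading; the enum and function names are hypothetical and do not correspond to anything in the driver.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only. */
enum example_mmap_mode {
	EXAMPLE_MMAP_WB,      /* write-back, no clflush needed */
	EXAMPLE_MMAP_WC,      /* write-combined */
	EXAMPLE_MMAP_INVALID, /* object has no allowed placement */
};

/*
 * Pick the single mmap mode for an object from its allowed placements,
 * following the table: smem-only -> WB, smem+lmem -> WC, lmem-only -> WC.
 */
enum example_mmap_mode example_pick_mmap_mode(bool can_smem, bool can_lmem)
{
	if (can_lmem)
		return EXAMPLE_MMAP_WC;
	if (can_smem)
		return EXAMPLE_MMAP_WB;
	return EXAMPLE_MMAP_INVALID;
}
```

Because the mode is derived purely from the placement list, which is immutable after creation, the mapping type never has to change during the object's lifetime.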

		Links
		=====
		[1] https://patchwork.freedesktop.org/series/86798/

		[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791

drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c

+40 −0

		@@ -468,6 +468,46 @@ bool amdgpu_atomfirmware_dynamic_boot_config_supported(struct amdgpu_device *ade
		return (fw_cap & ATOM_FIRMWARE_CAP_DYNAMIC_BOOT_CFG_ENABLE) ? true : false;
		}

		/*
		* Helper function to query RAS EEPROM address
		*
		* @adev: amdgpu_device pointer
		*
		* Return true if vbios supports ras rom address reporting
		*/
		bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev, uint8_t *i2c_address)
		{
		struct amdgpu_mode_info *mode_info = &adev->mode_info;
		int index;
		u16 data_offset, size;
		union firmware_info *firmware_info;
		u8 frev, crev;

		if (i2c_address == NULL)
		return false;

		*i2c_address = 0;

		index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1,
		firmwareinfo);

		if (amdgpu_atom_parse_data_header(adev->mode_info.atom_context,
		index, &size, &frev, &crev, &data_offset)) {
		/* support firmware_info 3.4 + */
		if ((frev == 3 && crev >= 4) || (frev > 3)) {
		firmware_info = (union firmware_info *)
		(mode_info->atom_context->bios + data_offset);
		*i2c_address = firmware_info->v34.ras_rom_i2c_slave_addr;
		}
		}

		if (*i2c_address != 0)
		return true;

		return false;
		}


		union smu_info {
		struct atom_smu_info_v3_1 v31;
		};
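The version gate in amdgpu_atomfirmware_ras_rom_addr() above ("firmware_info 3.4 +") can be isolated as a pure predicate. This is a sketch only; the helper name is hypothetical and does not exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the firmware_info table revision is
 * 3.4 or newer, i.e. format revision 3 with content revision >= 4, or
 * any later format revision.
 */
bool example_fw_info_is_v3_4_or_newer(uint8_t frev, uint8_t crev)
{
	return (frev == 3 && crev >= 4) || (frev > 3);
}
```

Note that the function in the diff applies a second condition on top of this: it only returns true when the address read from the table is non-zero.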

drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h

+1 −0

		@@ -36,6 +36,7 @@ int amdgpu_atomfirmware_get_clock_info(struct amdgpu_device *adev);
		int amdgpu_atomfirmware_get_gfx_info(struct amdgpu_device *adev);
		bool amdgpu_atomfirmware_mem_ecc_supported(struct amdgpu_device *adev);
		bool amdgpu_atomfirmware_sram_ecc_supported(struct amdgpu_device *adev);
		bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev, uint8_t *i2c_address);
		bool amdgpu_atomfirmware_mem_training_supported(struct amdgpu_device *adev);
		bool amdgpu_atomfirmware_dynamic_boot_config_supported(struct amdgpu_device *adev);
		int amdgpu_atomfirmware_get_fw_reserved_fb_size(struct amdgpu_device *adev);

drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c

+9 −3

		@@ -299,6 +299,9 @@ int amdgpu_discovery_reg_base_init(struct amdgpu_device *adev)
		ip->major, ip->minor,
		ip->revision);

		if (le16_to_cpu(ip->hw_id) == VCN_HWID)
		adev->vcn.num_vcn_inst++;

		for (k = 0; k < num_base_address; k++) {
		/*
		* convert the endianness of base addresses in place,
		@@ -385,7 +388,7 @@ void amdgpu_discovery_harvest_ip(struct amdgpu_device *adev)
		{
		struct binary_header *bhdr;
		struct harvest_table *harvest_info;
		int i;
		int i, vcn_harvest_count = 0;

		bhdr = (struct binary_header *)adev->mman.discovery_bin;
		harvest_info = (struct harvest_table *)(adev->mman.discovery_bin +
		@@ -397,8 +400,7 @@ void amdgpu_discovery_harvest_ip(struct amdgpu_device *adev)

		switch (le32_to_cpu(harvest_info->list[i].hw_id)) {
		case VCN_HWID:
		adev->harvest_ip_mask |= AMD_HARVEST_IP_VCN_MASK;
		adev->harvest_ip_mask |= AMD_HARVEST_IP_JPEG_MASK;
		vcn_harvest_count++;
		break;
		case DMU_HWID:
		adev->harvest_ip_mask |= AMD_HARVEST_IP_DMU_MASK;
		@@ -407,6 +409,10 @@ void amdgpu_discovery_harvest_ip(struct amdgpu_device *adev)
		break;
		}
		}
		if (vcn_harvest_count == adev->vcn.num_vcn_inst) {
		adev->harvest_ip_mask |= AMD_HARVEST_IP_VCN_MASK;
		adev->harvest_ip_mask |= AMD_HARVEST_IP_JPEG_MASK;
		}
		}

		int amdgpu_discovery_get_gfx_info(struct amdgpu_device *adev)
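The behavioural change in the hunks above is that VCN and JPEG are only marked as harvested once every VCN instance appears in the harvest table, instead of on the first VCN entry. A standalone sketch of that counting logic follows; all names and numeric values are hypothetical placeholders, not the driver's real hardware IDs or mask bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration; not the real amdgpu constants. */
#define EXAMPLE_VCN_HWID          1u
#define EXAMPLE_DMU_HWID          2u
#define EXAMPLE_HARVEST_VCN_MASK  0x1u
#define EXAMPLE_HARVEST_JPEG_MASK 0x2u
#define EXAMPLE_HARVEST_DMU_MASK  0x4u

/*
 * Walk the harvested hardware IDs and build the harvest mask. VCN/JPEG
 * bits are set only when the number of harvested VCN entries equals the
 * number of VCN instances found during discovery. As in the diff, the
 * comparison also holds when both counts are zero.
 */
uint32_t example_harvest_mask(const uint32_t *harvested_hw_ids, size_t count,
			      unsigned int num_vcn_inst)
{
	uint32_t mask = 0;
	unsigned int vcn_harvest_count = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		switch (harvested_hw_ids[i]) {
		case EXAMPLE_VCN_HWID:
			vcn_harvest_count++;
			break;
		case EXAMPLE_DMU_HWID:
			mask |= EXAMPLE_HARVEST_DMU_MASK;
			break;
		default:
			break;
		}
	}

	if (vcn_harvest_count == num_vcn_inst) {
		mask |= EXAMPLE_HARVEST_VCN_MASK;
		mask |= EXAMPLE_HARVEST_JPEG_MASK;
	}

	return mask;
}
```

For a device with two VCN instances, a table listing only one of them leaves the VCN and JPEG bits clear, so the remaining instance stays usable.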

drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

+2 −0

		@@ -1571,6 +1571,8 @@ static int amdgpu_pmops_runtime_suspend(struct device *dev)
		pci_ignore_hotplug(pdev);
		pci_set_power_state(pdev, PCI_D3cold);
		drm_dev->switch_power_state = DRM_SWITCH_POWER_DYNAMIC_OFF;
		} else if (amdgpu_device_supports_boco(drm_dev)) {
		/* nothing to do */
		} else if (amdgpu_device_supports_baco(drm_dev)) {
		amdgpu_device_baco_enter(drm_dev);
		}