Commit 4bf7fda4 authored by Robin Murphy, committed by Joerg Roedel

iommu/dma: Add config for PCI SAC address trick



For devices stuck behind a conventional PCI bus, saving extra cycles at
33MHz is probably fairly significant. However since native PCI Express
is now the norm for high-performance devices, the optimisation to always
prefer 32-bit addresses for the sake of avoiding DAC is starting to look
rather anachronistic. Technically 32-bit addresses do have shorter TLPs
on PCIe, but unless the device is saturating its link bandwidth with
small transfers it seems unlikely that the difference is appreciable.

What definitely is appreciable, however, is that the IOVA allocator
doesn't behave all that well once the 32-bit space starts getting full.
As DMA working sets get bigger, this optimisation increasingly backfires
and adds considerable overhead to the dma_map path for use-cases like
high-bandwidth networking. We've increasingly bandaged the allocator
in attempts to mitigate this, but it remains fundamentally at odds with
other valid requirements to try as hard as possible to satisfy a request
within the given limit; what we really need is to just avoid this odd
notion of a speculative allocation when it isn't beneficial anyway.
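
The speculative allocation described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel's allocator: `toy_iova_space`, `toy_alloc_below`, `toy_alloc_iova` and `TOY_DMA_BIT_MASK` are hypothetical names, and the "allocator" is a simple upward bump pointer that never frees or reuses space. It shows only the shape of the behaviour: try below 4GiB first, then retry against the device's full limit, so a full 32-bit space costs one extra failed attempt per mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask covering the low n address bits. */
#define TOY_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Toy address space: a bump pointer that only ever moves upward. */
struct toy_iova_space {
    uint64_t next;          /* next free address; must start non-zero */
    unsigned long attempts; /* number of allocation attempts made */
};

/* Try to place [next, next + size) at or below 'limit'; 0 means failure. */
static uint64_t toy_alloc_below(struct toy_iova_space *s, uint64_t size,
                                uint64_t limit)
{
    uint64_t iova;

    s->attempts++;
    if (s->next > limit || size - 1 > limit - s->next)
        return 0;
    iova = s->next;
    s->next += size;
    return iova;
}

/*
 * Model of the "SAC trick": unless forcedac is set, a device that can
 * address more than 32 bits first gets a speculative attempt below 4GiB,
 * and only on failure falls back to its full usable range.
 */
static uint64_t toy_alloc_iova(struct toy_iova_space *s, uint64_t size,
                               uint64_t dma_limit, bool forcedac)
{
    uint64_t iova = 0;

    assert(size > 0);
    if (dma_limit > TOY_DMA_BIT_MASK(32) && !forcedac)
        iova = toy_alloc_below(s, size, TOY_DMA_BIT_MASK(32));
    if (!iova)
        iova = toy_alloc_below(s, size, dma_limit);
    return iova;
}
```

In this model, once `next` has passed 4GiB every call with `forcedac` false makes two attempts instead of one. In a real allocator the failed attempt is far more expensive than a single comparison, which is the overhead the message refers to.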

Unfortunately that's where things get awkward... Having been present on
x86 for 15 years or so now, it turns out there are systems which fail to
properly define the upper limit of usable IOVA space for certain devices
and this trick was the only thing letting them work OK. I had a similar
ulterior motive for a couple of early arm64 systems when originally
adding it to iommu-dma, but those really should be fixed with proper
firmware bindings by now. Let's be brave and default it to off in the
hope that CI systems and developers will find and fix those bugs, but
expect that desktop-focused distro configs are likely to want to turn
it back on for maximum compatibility.
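
As a rough illustration of how the new default and the boot-time override fit together, the following userspace sketch resolves the effective "force DAC" setting. It rests on assumptions: `TOY_CONFIG_IOMMU_DMA_PCI_SAC` and `toy_resolve_forcedac` are hypothetical stand-ins, and the string handling is deliberately simpler than the kernel's boolean parameter parsing. The only behaviour taken from this patch is that the default is the inverse of the config option and that "iommu.forcedac=0" re-enables the trick.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the Kconfig option; 0 models "not set". */
#ifndef TOY_CONFIG_IOMMU_DMA_PCI_SAC
#define TOY_CONFIG_IOMMU_DMA_PCI_SAC 0
#endif

/*
 * Resolve the effective setting. 'arg' models the value given to
 * iommu.forcedac on the command line, or NULL when it is absent.
 * Returns true when the 32-bit-first trick is skipped.
 */
static bool toy_resolve_forcedac(const char *arg)
{
    /* Build-time default: skip the trick unless the option is enabled. */
    bool forcedac = !TOY_CONFIG_IOMMU_DMA_PCI_SAC;

    if (arg && strcmp(arg, "0") == 0)
        forcedac = false;   /* trick re-enabled at boot */
    else if (arg && strcmp(arg, "1") == 0)
        forcedac = true;    /* trick disabled at boot */
    /* Any other value leaves the build-time default in place. */
    return forcedac;
}
```

With the option unset, `toy_resolve_forcedac(NULL)` yields true (trick off), while `toy_resolve_forcedac("0")` yields false (trick back on).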

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/3f06994f9f370f9d35b2630ab75171ecd2065621.1654782107.git.robin.murphy@arm.com


Signed-off-by: Joerg Roedel <jroedel@suse.de>
parent 822242e6
+26 −0
@@ -144,6 +144,32 @@ config IOMMU_DMA
 	select IRQ_MSI_IOMMU
 	select NEED_SG_DMA_LENGTH
 
+config IOMMU_DMA_PCI_SAC
+	bool "Enable 64-bit legacy PCI optimisation by default"
+	depends on IOMMU_DMA
+	help
+	  Enable by default an IOMMU optimisation for 64-bit legacy PCI devices,
+	  wherein the DMA API layer will always first try to allocate a 32-bit
+	  DMA address suitable for a single address cycle, before falling back
+	  to allocating from the device's full usable address range. If your
+	  system has 64-bit legacy PCI devices in 32-bit slots where using dual
+	  address cycles reduces DMA throughput significantly, this may be
+	  beneficial to overall performance.
+
+	  If you have a modern PCI Express based system, this feature mostly just
+	  represents extra overhead in the allocation path for no practical
+	  benefit, and it should usually be preferable to say "n" here.
+
+	  However, beware that this feature has also historically papered over
+	  bugs where the IOMMU address width and/or device DMA mask is not set
+	  correctly. If device DMA problems and IOMMU faults start occurring
+	  after disabling this option, it is almost certainly indicative of a
+	  latent driver or firmware/BIOS bug, which would previously have only
+	  manifested with several gigabytes worth of concurrent DMA mappings.
+
+	  If this option is not set, the feature can still be re-enabled at
+	  boot time with the "iommu.forcedac=0" command-line argument.
+
 # Shared Virtual Addressing
 config IOMMU_SVA
 	bool
+1 −1
@@ -67,7 +67,7 @@ struct iommu_dma_cookie {
 };
 
 static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled);
-bool iommu_dma_forcedac __read_mostly;
+bool iommu_dma_forcedac __read_mostly = !IS_ENABLED(CONFIG_IOMMU_DMA_PCI_SAC);
 
 static int __init iommu_dma_forcedac_setup(char *str)
 {