Skip to content
  1. Apr 20, 2017
  2. Apr 07, 2017
    • Nate Watterson's avatar
      iommu/iova: Fix underflow bug in __alloc_and_insert_iova_range · 5016bdb7
      Nate Watterson authored
      
      
      Normally, calling alloc_iova() using an iova_domain with insufficient
      pfns remaining between start_pfn and dma_limit will fail and return a
      NULL pointer. Unexpectedly, if such a "full" iova_domain contains an
      iova with pfn_lo == 0, the alloc_iova() call will instead succeed and
      return an iova containing invalid pfns.
      
      This is caused by an underflow bug in __alloc_and_insert_iova_range()
      that occurs after walking the "full" iova tree when the search ends
      at the iova with pfn_lo == 0 and limit_pfn is then adjusted to be just
      below that (-1). This (now huge) limit_pfn gives the impression that a
      vast amount of space is available between it and start_pfn and thus
      a new iova is allocated with the invalid pfn_hi value, 0xFFF.... .
      
      To rememdy this, a check is introduced to ensure that adjustments to
      limit_pfn will not underflow.
      
      This issue has been observed in the wild, and is easily reproduced with
      the following sample code.
      
      	struct iova_domain *iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
      	struct iova *rsvd_iova, *good_iova, *bad_iova;
      	unsigned long limit_pfn = 3;
      	unsigned long start_pfn = 1;
      	unsigned long va_size = 2;
      
      	init_iova_domain(iovad, SZ_4K, start_pfn, limit_pfn);
      	rsvd_iova = reserve_iova(iovad, 0, 0);
      	good_iova = alloc_iova(iovad, va_size, limit_pfn, true);
      	bad_iova = alloc_iova(iovad, va_size, limit_pfn, true);
      
      Prior to the patch, this yielded:
      	*rsvd_iova == {0, 0}   /* Expected */
      	*good_iova == {2, 3}   /* Expected */
      	*bad_iova  == {-2, -1} /* Oh no... */
      
      After the patch, bad_iova is NULL as expected since inadequate
      space remains between limit_pfn and start_pfn after allocating
      good_iova.
      
      Signed-off-by: default avatarNate Watterson <nwatters@codeaurora.org>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      5016bdb7
  3. Apr 03, 2017
  4. Mar 22, 2017
    • Robin Murphy's avatar
      iommu/dma: Make PCI window reservation generic · 273df963
      Robin Murphy authored
      
      
      Now that we're applying the IOMMU API reserved regions to our IOVA
      domains, we shouldn't need to privately special-case PCI windows, or
      indeed anything else which isn't specific to our iommu-dma layer.
      However, since those aren't IOMMU-specific either, rather than start
      duplicating code into IOMMU drivers let's transform the existing
      function into an iommu_get_resv_regions() helper that they can share.
      
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      273df963
    • Robin Murphy's avatar
      iommu/dma: Handle IOMMU API reserved regions · 7c1b058c
      Robin Murphy authored
      
      
      Now that it's simple to discover the necessary reservations for a given
      device/IOMMU combination, let's wire up the appropriate handling. Basic
      reserved regions and direct-mapped regions we simply have to carve out
      of IOVA space (the IOMMU core having already mapped the latter before
      attaching the device). For hardware MSI regions, we also pre-populate
      the cookie with matching msi_pages. That way, irqchip drivers which
      normally assume MSIs to require mapping at the IOMMU can keep working
      without having to special-case their iommu_dma_map_msi_msg() hook, or
      indeed be aware at all of quirks preventing the IOMMU from translating
      certain addresses.
      
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      7c1b058c
    • Robin Murphy's avatar
      iommu/dma: Don't reserve PCI I/O windows · 938f1bbe
      Robin Murphy authored
      Even if a host controller's CPU-side MMIO windows into PCI I/O space do
      happen to leak into PCI memory space such that it might treat them as
      peer addresses, trying to reserve the corresponding I/O space addresses
      doesn't do anything to help solve that problem. Stop doing a silly thing.
      
      Fixes: fade1ec0
      
       ("iommu/dma: Avoid PCI host bridge windows")
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      938f1bbe
    • Robin Murphy's avatar
      iommu: Disambiguate MSI region types · 9d3a4de4
      Robin Murphy authored
      The introduction of reserved regions has left a couple of rough edges
      which we could do with sorting out sooner rather than later. Since we
      are not yet addressing the potential dynamic aspect of software-managed
      reservations and presenting them at arbitrary fixed addresses, it is
      incongruous that we end up displaying hardware vs. software-managed MSI
      regions to userspace differently, especially since ARM-based systems may
      actually require one or the other, or even potentially both at once,
      (which iommu-dma currently has no hope of dealing with at all). Let's
      resolve the former user-visible inconsistency ASAP before the ABI has
      been baked into a kernel release, in a way that also lays the groundwork
      for the latter shortcoming to be addressed by follow-up patches.
      
      For clarity, rename the software-managed type to IOMMU_RESV_SW_MSI, use
      IOMMU_RESV_MSI to describe the hardware type, and document everything a
      little bit. Since the x86 MSI remapping hardware falls squarely under
      this meaning of IOMMU_RESV_MSI, apply that type to their regions as well,
      so that we tell the same story to userspace across all platforms.
      
      Secondly, as the various region types require quite different handling,
      and it really makes little sense to ever try combining them, convert the
      bitfield-esque #defines to a plain enum in the process before anyone
      gets the wrong impression.
      
      Fixes: d30ddcaa
      
       ("iommu: Add a new type field in iommu_resv_region")
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      CC: Alex Williamson <alex.williamson@redhat.com>
      CC: David Woodhouse <dwmw2@infradead.org>
      CC: kvm@vger.kernel.org
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      9d3a4de4
    • Marek Szyprowski's avatar
      iommu/exynos: Workaround FLPD cache flush issues for SYSMMU v5 · cd37a296
      Marek Szyprowski authored
      For some unknown reasons, in some cases, FLPD cache invalidation doesn't
      work properly with SYSMMU v5 controllers found in Exynos5433 SoCs. This
      can be observed by a firmware crash during initialization phase of MFC
      video decoder available in the mentioned SoCs when IOMMU support is
      enabled. To workaround this issue perform a full TLB/FLPD invalidation
      in case of replacing any first level page descriptors in case of SYSMMU v5.
      
      Fixes: 740a01ee
      
       ("iommu/exynos: Add support for v5 SYSMMU")
      CC: stable@vger.kernel.org # v4.10+
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Tested-by: default avatarAndrzej Hajda <a.hajda@samsung.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      cd37a296
    • Marek Szyprowski's avatar
      iommu/exynos: Block SYSMMU while invalidating FLPD cache · 7d2aa6b8
      Marek Szyprowski authored
      Documentation specifies that SYSMMU should be in blocked state while
      performing TLB/FLPD cache invalidation, so add needed calls to
      sysmmu_block/unblock.
      
      Fixes: 66a7ed84
      
       ("iommu/exynos: Apply workaround of caching fault page table entries")
      CC: stable@vger.kernel.org # v4.10+
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      7d2aa6b8
    • Koos Vriezen's avatar
      iommu/vt-d: Fix NULL pointer dereference in device_to_iommu · 5003ae1e
      Koos Vriezen authored
      The function device_to_iommu() in the Intel VT-d driver
      lacks a NULL-ptr check, resulting in this oops at boot on
      some platforms:
      
       BUG: unable to handle kernel NULL pointer dereference at 00000000000007ab
       IP: [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
       PGD 0
      
       [...]
      
       Call Trace:
         ? find_or_alloc_domain.constprop.29+0x1a/0x300
         ? dw_dma_probe+0x561/0x580 [dw_dmac_core]
         ? __get_valid_domain_for_dev+0x39/0x120
         ? __intel_map_single+0x138/0x180
         ? intel_alloc_coherent+0xb6/0x120
         ? sst_hsw_dsp_init+0x173/0x420 [snd_soc_sst_haswell_pcm]
         ? mutex_lock+0x9/0x30
         ? kernfs_add_one+0xdb/0x130
         ? devres_add+0x19/0x60
         ? hsw_pcm_dev_probe+0x46/0xd0 [snd_soc_sst_haswell_pcm]
         ? platform_drv_probe+0x30/0x90
         ? driver_probe_device+0x1ed/0x2b0
         ? __driver_attach+0x8f/0xa0
         ? driver_probe_device+0x2b0/0x2b0
         ? bus_for_each_dev+0x55/0x90
         ? bus_add_driver+0x110/0x210
         ? 0xffffffffa11ea000
         ? driver_register+0x52/0xc0
         ? 0xffffffffa11ea000
         ? do_one_initcall+0x32/0x130
         ? free_vmap_area_noflush+0x37/0x70
         ? kmem_cache_alloc+0x88/0xd0
         ? do_init_module+0x51/0x1c4
         ? load_module+0x1ee9/0x2430
         ? show_taint+0x20/0x20
         ? kernel_read_file+0xfd/0x190
         ? SyS_finit_module+0xa3/0xb0
         ? do_syscall_64+0x4a/0xb0
         ? entry_SYSCALL64_slow_path+0x25/0x25
       Code: 78 ff ff ff 4d 85 c0 74 ee 49 8b 5a 10 0f b6 9b e0 00 00 00 41 38 98 e0 00 00 00 77 da 0f b6 eb 49 39 a8 88 00 00 00 72 ce eb 8f <41> f6 82 ab 07 00 00 04 0f 85 76 ff ff ff 0f b6 4d 08 88 0e 49
       RIP  [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
        RSP <ffffc90001457a78>
       CR2: 00000000000007ab
       ---[ end trace 16f974b6d58d0aad ]---
      
      Add the missing pointer check.
      
      Fixes: 1c387188
      
       ("iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions")
      Signed-off-by: default avatarKoos Vriezen <koos.vriezen@gmail.com>
      Cc: stable@vger.kernel.org # 4.8.15+
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      5003ae1e
  5. Mar 20, 2017
    • Linus Torvalds's avatar
      Linux 4.11-rc3 · 97da3854
      Linus Torvalds authored
      v4.11-rc3
      97da3854
    • Linus Torvalds's avatar
      mm/swap: don't BUG_ON() due to uninitialized swap slot cache · 452b94b8
      Linus Torvalds authored
      
      
      This BUG_ON() triggered for me once at shutdown, and I don't see a
      reason for the check.  The code correctly checks whether the swap slot
      cache is usable or not, so an uninitialized swap slot cache is not
      actually problematic afaik.
      
      I've temporarily just switched the BUG_ON() to a WARN_ON_ONCE(), since
      I'm not sure why that seemingly pointless check was there.  I suspect
      the real fix is to just remove it entirely, but for now we'll warn about
      it but not bring the machine down.
      
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      452b94b8
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · a07a6e41
      Linus Torvalds authored
      Pull more powerpc fixes from Michael Ellerman:
       "A couple of minor powerpc fixes for 4.11:
      
         - wire up statx() syscall
      
         - don't print a warning on memory hotplug when HPT resizing isn't
           available
      
        Thanks to: David Gibson, Chandan Rajendra"
      
      * tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/pseries: Don't give a warning when HPT resizing isn't available
        powerpc: Wire up statx() syscall
      a07a6e41
    • Linus Torvalds's avatar
      Merge branch 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 4571bc5a
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
      
       - Mikulas Patocka added support for R_PARISC_SECREL32 relocations in
         modules with CONFIG_MODVERSIONS.
      
       - Dave Anglin optimized the cache flushing for vmap ranges.
      
       - Arvind Yadav provided a fix for a potential NULL pointer dereference
         in the parisc perf code (and some code cleanups).
      
       - I wired up the new statx system call, fixed some compiler warnings
         with the access_ok() macro and fixed shutdown code to really halt a
         system at shutdown instead of crashing & rebooting.
      
      * 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Fix system shutdown halt
        parisc: perf: Fix potential NULL pointer dereference
        parisc: Avoid compiler warnings with access_ok()
        parisc: Wire up statx system call
        parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range
        parisc: support R_PARISC_SECREL32 relocation in modules
      4571bc5a
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 8aa34172
      Linus Torvalds authored
      Pull SCSI target fixes from Nicholas Bellinger:
       "The bulk of the changes are in qla2xxx target driver code to address
        various issues found during Cavium/QLogic's internal testing (stable
        CC's included), along with a few other stability and smaller
        miscellaneous improvements.
      
        There are also a couple of different patch sets from Mike Christie,
        which have been a result of his work to use target-core ALUA logic
        together with tcm-user backend driver.
      
        Finally, a patch to address some long standing issues with
        pass-through SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices,
        which will make folks using physical (or virtual) magnetic tape happy"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits)
        qla2xxx: Update driver version to 9.00.00.00-k
        qla2xxx: Fix delayed response to command for loop mode/direct connect.
        qla2xxx: Change scsi host lookup method.
        qla2xxx: Add DebugFS node to display Port Database
        qla2xxx: Use IOCB interface to submit non-critical MBX.
        qla2xxx: Add async new target notification
        qla2xxx: Export DIF stats via debugfs
        qla2xxx: Improve T10-DIF/PI handling in driver.
        qla2xxx: Allow relogin to proceed if remote login did not finish
        qla2xxx: Fix sess_lock & hardware_lock lock order problem.
        qla2xxx: Fix inadequate lock protection for ABTS.
        qla2xxx: Fix request queue corruption.
        qla2xxx: Fix memory leak for abts processing
        qla2xxx: Allow vref count to timeout on vport delete.
        tcmu: Convert cmd_time_out into backend device attribute
        tcmu: make cmd timeout configurable
        tcmu: add helper to check if dev was configured
        target: fix race during implicit transition work flushes
        target: allow userspace to set state to transitioning
        target: fix ALUA transition timeout handling
        ...
      8aa34172
    • Linus Torvalds's avatar
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 1b8df619
      Linus Torvalds authored
      Pull device-dax fixes from Dan Williams:
       "The device-dax driver was not being careful to handle falling back to
        smaller fault-granularity sizes.
      
        The driver already fails fault attempts that are smaller than the
        device's alignment, but it also needs to handle the cases where a
        larger page mapping could be established. For simplicity of the
        immediate fix the implementation just signals VM_FAULT_FALLBACK until
        fault-size == device-alignment.
      
        One fix is for -stable to address pmd-to-pte fallback from the
        original implementation, another fix is for the new (introduced in
        4.11-rc1) pud-to-pmd regression, and a typo fix comes along for the
        ride.
      
        These have received a build success notification from the kbuild
        robot"
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        device-dax: fix debug output typo
        device-dax: fix pud fault fallback handling
        device-dax: fix pmd/pte fault fallback handling
      1b8df619
  6. Mar 19, 2017