Commit ae0215b2 authored by Alexey Kardashevskiy's avatar Alexey Kardashevskiy Committed by Alex Williamson
Browse files

vfio-pci: Allow mmap of MSIX BAR

At the moment we unconditionally avoid mapping MSIX data of a BAR and
emulate MSIX table in QEMU. However it is 1) not always necessary as
a platform may provide a paravirt interface for MSIX configuration;
2) can affect the speed of MMIO access by emulating them in QEMU when
frequently accessed registers share same system page with MSIX data,
this is particularly a problem for systems with the page size bigger
than 4KB.

A new capability - VFIO_REGION_INFO_CAP_MSIX_MAPPABLE - has been added
to the kernel [1] which tells the userspace that mapping of the MSIX data
is possible now. This makes use of it so from now on QEMU tries mapping
the entire BAR as a whole and emulate MSIX on top of that.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a32295c612c57990d17fb0f41e7134394b2f35f6



Signed-off-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: default avatarDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
parent 567b5b30
Loading
Loading
Loading
Loading
+15 −0
Original line number Diff line number Diff line
@@ -1471,6 +1471,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
    return -ENODEV;
}

bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type)
{
    struct vfio_region_info *info = NULL;
    bool ret = false;

    if (!vfio_get_region_info(vbasedev, region, &info)) {
        if (vfio_get_region_info_cap(info, cap_type)) {
            ret = true;
        }
        g_free(info);
    }

    return ret;
}

/*
 * Interfaces for IBM EEH (Enhanced Error Handling)
 */
+9 −0
Original line number Diff line number Diff line
@@ -1294,6 +1294,15 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
    off_t start, end;
    VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;

    /*
     * If the host driver allows mapping of a MSIX data, we are going to
     * do map the entire BAR and emulate MSIX table on top of that.
     */
    if (vfio_has_region_cap(&vdev->vbasedev, region->nr,
                            VFIO_REGION_INFO_CAP_MSIX_MAPPABLE)) {
        return;
    }

    /*
     * We expect to find a single mmap covering the whole BAR, anything else
     * means it's either unsupported or already setup.
+1 −0
Original line number Diff line number Diff line
@@ -193,6 +193,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
                         struct vfio_region_info **info);
int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
                             uint32_t subtype, struct vfio_region_info **info);
bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type);
#endif
extern const MemoryListener vfio_prereg_listener;