  1. Sep 21, 2018
    • RDMA/ucontext: Add a core API for mmaping driver IO memory · 5f9794dc
      Jason Gunthorpe authored
      To support disassociation and PCI hot unplug, we have to track all the
      VMAs that refer to the device IO memory. When disassociation occurs the
      VMAs have to be revised to point to the zero page, not the IO memory, to
      allow the physical HW to be unplugged.
      
      The three drivers supporting this implemented three different versions
      of this algorithm, all leaving something to be desired. This new common
      implementation has a few differences from the driver versions:
      
      - Track all VMAs, including splitting/truncating/etc. Tie the lifetime of
        the private data allocation to the lifetime of the vma. This avoids any
        tricks with setting vm_ops which Linus didn't like. (see link)
      - Support multiple mms, and support properly tracking mmaps triggered by
        processes other than the one first opening the uverbs fd. This makes
        fork behavior of disassociation enabled drivers the same as fork support
        in normal drivers.
      - Don't use crazy get_task stuff.
      - Simplify the approach to racing between vm_ops close and
        disassociation, fixing the related bugs most of the driver
        implementations had. Since we are in core code the tracking list can be
        placed in struct ib_uverbs_ufile, which has a lifetime strictly longer
        than any VMAs created by mmap on the uverbs FD.
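The tracking scheme described in the list above can be modelled in plain userspace C. The following is only a toy sketch under stated assumptions, not the kernel implementation: `demo_vma`, `demo_entry`, `demo_ufile` and every function name are hypothetical, the "zero page" redirection is reduced to clearing a flag, and the locking that real code would need around the list is omitted. It shows one tracking entry per VMA, owned by a per-file list that outlives every VMA, with close and disassociation both able to retire an entry safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VMA: only the fields this sketch needs. */
struct demo_vma {
	bool points_at_io;       /* true while the mapping targets device IO memory */
	struct demo_entry *priv; /* per-VMA tracking entry, NULL once untracked */
};

/* One tracking entry per VMA; its lifetime is tied to that VMA. */
struct demo_entry {
	struct demo_vma *vma;
	struct demo_entry *prev, *next;
};

/* Hypothetical per-file object; assumed to outlive every VMA it tracks. */
struct demo_ufile {
	struct demo_entry head; /* sentinel of a circular doubly linked list */
	size_t count;
};

void demo_ufile_init(struct demo_ufile *f)
{
	f->head.vma = NULL;
	f->head.prev = f->head.next = &f->head;
	f->count = 0;
}

/* Called for the initial mmap and for every duplicate (split, fork, ...). */
int demo_vma_open(struct demo_ufile *f, struct demo_vma *vma)
{
	struct demo_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->vma = vma;
	e->prev = &f->head;
	e->next = f->head.next;
	f->head.next->prev = e;
	f->head.next = e;
	vma->priv = e;
	vma->points_at_io = true;
	f->count++;
	return 0;
}

/* Called when a VMA goes away; a no-op if disassociation got there first. */
void demo_vma_close(struct demo_ufile *f, struct demo_vma *vma)
{
	struct demo_entry *e = vma->priv;

	if (!e)
		return;
	e->prev->next = e->next;
	e->next->prev = e->prev;
	free(e);
	vma->priv = NULL;
	f->count--;
}

/* Redirect every tracked VMA away from IO memory and stop tracking it. */
size_t demo_disassociate(struct demo_ufile *f)
{
	size_t zapped = 0;

	while (f->head.next != &f->head) {
		struct demo_vma *vma = f->head.next->vma;

		vma->points_at_io = false; /* models "now points to the zero page" */
		demo_vma_close(f, vma);
		zapped++;
	}
	return zapped;
}
```

Because the list lives in the per-file object rather than in any single mapping, a close that arrives after disassociation simply finds `priv` already cleared and returns.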
      
      Link: https://www.spinics.net/lists/stable/msg248747.html
      Link: https://lkml.kernel.org/r/CA+55aFxJTV_g46AQPoPXen-UPiqR1HGMZictt7VpC-SMFbm3Cw@mail.gmail.com
      
      
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  2. Sep 20, 2018
  3. Sep 18, 2018
    • IB/rxe: Revise the ib_wr_opcode enum · 9a59739b
      Jason Gunthorpe authored
      
      
      This enum has become part of the uABI, as both RXE and the
      ib_uverbs_post_send() command expect userspace to supply values from this
      enum. So it should be properly placed in include/uapi/rdma.
      
      In userspace this enum is called 'enum ibv_wr_opcode' as part of
      libibverbs.h. That enum defines different values for IB_WR_LOCAL_INV,
      IB_WR_SEND_WITH_INV, and IB_WR_LSO. These were introduced (incorrectly, it
      turns out) into libibverbs in 2015.
      
      The kernel has changed its mind on the numbering for several of the IB_WC
      values over the years, but has remained stable on IB_WR_LOCAL_INV and
      below.
      
      Based on this we can conclude that there is no real user space user of the
      values beyond IB_WR_ATOMIC_FETCH_AND_ADD, as they have never worked via
      rdma-core. This is confirmed by inspection: only rxe uses the kernel enum
      and implements the latter operations. rxe has clearly never worked with
      these attributes from userspace. Other drivers that support these opcodes
      implement the functionality without calling out to the kernel.
      
      To make IB_WR_SEND_WITH_INV and related work for RXE in userspace we
      choose to renumber the IB_WR enum in the kernel to match the uABI that
      userspace has been using since before Soft RoCE was merged. This is an
      overall simpler configuration for the whole software stack, and obviously
      can't break anything existing.
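The general pattern of pinning an opcode enum to a uABI can be sketched as follows. This is a minimal illustration under assumptions, not the contents of the real headers: the `demo_` names and the numeric values are hypothetical and are not copied from include/uapi/rdma or libibverbs. It shows the uABI enum carrying explicit values, the in-kernel enum defined in terms of it, and a compile-time check that the two cannot drift apart.

```c
#include <assert.h>

/* Hypothetical uABI enum: every value is explicit so it can never shift. */
enum demo_uapi_wr_opcode {
	DEMO_UAPI_WR_RDMA_WRITE = 0,
	DEMO_UAPI_WR_SEND = 1,
	DEMO_UAPI_WR_LOCAL_INV = 2,
	DEMO_UAPI_WR_SEND_WITH_INV = 3,
};

/* Hypothetical in-kernel enum: defined by reference to the uABI values. */
enum demo_kern_wr_opcode {
	DEMO_WR_RDMA_WRITE = DEMO_UAPI_WR_RDMA_WRITE,
	DEMO_WR_SEND = DEMO_UAPI_WR_SEND,
	DEMO_WR_LOCAL_INV = DEMO_UAPI_WR_LOCAL_INV,
	DEMO_WR_SEND_WITH_INV = DEMO_UAPI_WR_SEND_WITH_INV,
};

/* Compile-time guard: the build fails if the two enums ever disagree. */
_Static_assert(DEMO_WR_SEND_WITH_INV == DEMO_UAPI_WR_SEND_WITH_INV,
	       "kernel opcode must match the uABI value");

/*
 * Translate a value supplied by userspace into the kernel enum.
 * Returns -1 for anything outside the known uABI range.
 */
int demo_opcode_from_user(unsigned int user_opcode)
{
	switch (user_opcode) {
	case DEMO_UAPI_WR_RDMA_WRITE:
	case DEMO_UAPI_WR_SEND:
	case DEMO_UAPI_WR_LOCAL_INV:
	case DEMO_UAPI_WR_SEND_WITH_INV:
		return (int)user_opcode; /* identical numbering, no remapping */
	default:
		return -1;
	}
}
```

With identical numbering on both sides, the translation is a range check rather than a lookup table, which is the simplification the message refers to.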
      
      Reported-by: Seth Howell <seth.howell@intel.com>
      Tested-by: Seth Howell <seth.howell@intel.com>
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  4. Sep 14, 2018
  5. Sep 13, 2018
  6. Sep 12, 2018
  7. Sep 11, 2018