  1. May 28, 2019
    • drm/i915: Make sure we have enough memory bandwidth on ICL · c457d9cf
      Ville Syrjälä authored
      
      
      ICL has so many planes that it can easily exceed the maximum
      effective memory bandwidth of the system. We must therefore check
      that we don't exceed that limit.
      
      The algorithm is very magic number heavy and lacks sufficient
      explanation for now. We also have no sane way to query the
      memory clock and timings, so we must rely on a combination of
      raw readout from the memory controller and hardcoded assumptions.
      The memory controller values obviously change as the system
      jumps between the different SAGV points, so we try to stabilize
      it first by disabling SAGV for the duration of the readout.
      
      The utilized bandwidth is tracked via a device wide atomic
      private object. That is actually not robust because we can't
      afford to enforce strict global ordering between the pipes.
      Thus I think I'll need to change this to simply chop up the
      available bandwidth between all the active pipes. Each pipe
      can then do whatever it wants as long as it doesn't exceed
      its budget. That scheme will also require that we assume that
      any number of planes could be active at any time.
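The per-pipe budget scheme described above could look roughly like the following. This is only a sketch under stated assumptions: the even split, the names pipe_bw_budget() and pipe_bw_ok(), and the bitmask representation of active pipes are all hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split the available bandwidth evenly between pipes. */
unsigned int pipe_bw_budget(unsigned int total_bw, uint8_t active_pipes)
{
	unsigned int num_active = 0;

	/* Count the set bits in the active pipe mask. */
	for (; active_pipes; active_pipes &= active_pipes - 1)
		num_active++;

	/* No active pipes: nothing to hand out. */
	if (num_active == 0)
		return 0;

	return total_bw / num_active;
}

/* A pipe passes the check as long as it stays within its own budget. */
int pipe_bw_ok(unsigned int pipe_bw, unsigned int total_bw, uint8_t active_pipes)
{
	return pipe_bw <= pipe_bw_budget(total_bw, active_pipes);
}
```

With such a split no pipe needs to know what the others are doing, which is the point of the proposal: the check becomes local to each pipe at the cost of a more conservative limit.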
      
      TODO: make it robust and deal with all the open questions
      
      v2: Sleep longer after disabling SAGV
      v3: Poll for the dclk to get raised (seen it take 250ms!)
          If the system has 2133MT/s memory then we pointlessly
          wait one full second :(
      v4: Use the new pcode interface to get the qgv points rather
          than using hardcoded numbers
      v5: Move the pcode stuff into intel_bw.c (Matt)
          s/intel_sagv_info/intel_qgv_info/
          Do the NV12/P010 as per spec for now (Matt)
          s/IS_ICELAKE/IS_GEN11/
      v6: Ignore bandwidth limits if the pcode query fails
      
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
      Acked-by: Clint Taylor <Clinton.A.Taylor@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190524153614.32410-1-ville.syrjala@linux.intel.com
    • drm/i915: Make sandybridge_pcode_read() deal with the second data register · d284d514
      Ville Syrjälä authored
      
      
      The pcode mailbox has two data registers. So far we've only ever used
      the one, but that's about to change. Expose the second data register to
      the callers of sandybridge_pcode_read().
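One way to expose a second data register to callers is an optional extra output pointer. The sketch below only illustrates that idea: struct fake_mailbox and fake_pcode_read() are hypothetical stand-ins, not the real register layout or the real signature of sandybridge_pcode_read().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the mailbox hardware; not the real register layout. */
struct fake_mailbox {
	uint32_t data0;
	uint32_t data1;
};

/*
 * Hypothetical read helper: val receives the first data register,
 * val1 (optional, may be NULL) receives the second one.
 */
int fake_pcode_read(const struct fake_mailbox *mbox, uint32_t *val,
		    uint32_t *val1)
{
	if (!mbox || !val)
		return -1;

	*val = mbox->data0;
	if (val1)
		*val1 = mbox->data1;

	return 0;
}
```

Existing callers that only care about the first register would pass NULL for the second pointer and behave as before.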
      
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
      Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
      Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521164025.30225-1-ville.syrjala@linux.intel.com
  2. May 27, 2019
    • drm/i915/icl: Fix AUX-B HW not done issue w/o AUX-A · 4361ccac
      Imre Deak authored
      
      
      Atm AUX-B transfers can fail with the following error if AUX-A is not
      enabled:
      
      [  594.594108] [drm:intel_dp_aux_xfer [i915]] dp_aux_ch timeout status 0x7c2003ff
      [  594.615854] [drm:intel_dp_aux_xfer [i915]] *ERROR* dp aux hw did not signal timeout!
      [  594.632851] [drm:intel_dp_aux_xfer [i915]] *ERROR* dp aux hw did not signal timeout!
      [  594.632915] [drm:intel_dp_aux_xfer [i915]] *ERROR* dp_aux_ch not done status 0xac2003ff
      [  594.641786] ------------[ cut here ]------------
      [  594.641790] dp_aux_ch not started status 0xac2003ff
      [  594.641874] WARNING: CPU: 4 PID: 1366 at drivers/gpu/drm/i915/intel_dp.c:1268 intel_dp_aux_xfer+0x232/0x890 [i915]
      
      Ville noticed this issue already earlier and managed to work around it
      by keeping AUX-A always powered whenever AUX-B was used. He also
      reported the issue to HW folks and they have now root caused the problem
      and updated BSpec with a fix (see internal BSpec/Index/21257,
      HSD/1607152412).
      
      I noticed the same error - even with the WA being applied - while doing
      AUX transfers with Chamelium being connected with a DP cable to the
      source but letting Chamelium imitate an unplug. This is probably some
      non-standard way on Chamelium's part of disconnecting itself from the
      AUX pins. For instance it could still pull on the AUX pins which would
      prevent the source from detecting AUX timeouts in the proper way,
      leading to the ERRORs or WARNs seen in the logs in the Reference: bug
      below.
      
      In case I disconnect the sink properly (the cable itself, not via the
      Chamelium unplug xmlrpc command) then the AUX timeout signaling works
      properly and so there won't be any ERRORs/WARNs emitted.
      
      Reference: https://bugs.freedesktop.org/show_bug.cgi?id=110718
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190524173532.6444-1-imre.deak@intel.com
    • drm/i915: make REG_BIT() and REG_GENMASK() work with variables · 591d4dc4
      Jani Nikula authored
      
      
      REG_BIT() and REG_GENMASK() were intended to work with both constant
      expressions and otherwise, with the former having extra compile time
      checks for the bit ranges. Incredibly, the result of
      __builtin_constant_p() is not an integer constant expression when given
      a non-constant expression, leading to errors in BUILD_BUG_ON_ZERO().
      
      Replace __builtin_constant_p() with the __is_constexpr() magic spell.
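The trick can be sketched outside the kernel as follows. The SKETCH_* macro names are hypothetical; the constexpr test follows the widely known sizeof/ternary idiom and relies on GNU C extensions (sizeof(void) being 1, constant folding of `0 && x` in a bit-field width), so it is an illustration for GCC/Clang rather than portable C.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates to 1 for integer constant expressions, 0 otherwise (GNU C). */
#define SKETCH_IS_CONSTEXPR(x) \
	(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

/* Fails to compile when e is a non-zero constant; yields 0 otherwise. */
#define SKETCH_BUILD_BUG_ON_ZERO(e) ((int)sizeof(struct { int : (-!!(e)); }))

/* Bit n of a 32-bit register; range-checked at build time for constants. */
#define SKETCH_REG_BIT(n) \
	((uint32_t)((1u << (n)) + \
		    SKETCH_BUILD_BUG_ON_ZERO(SKETCH_IS_CONSTEXPR(n) && \
					     ((n) < 0 || (n) > 31))))

uint32_t sketch_bit_from_variable(int n)
{
	/* n is not a constant expression, so only the shift remains. */
	return SKETCH_REG_BIT(n);
}
```

The important property is that SKETCH_IS_CONSTEXPR() is itself an integer constant expression for any argument, so the build-time check collapses to zero for variables instead of breaking the build.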
      
      Reported-by: Ville Syrjala <ville.syrjala@linux.intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190524185253.1088-1-jani.nikula@intel.com
    • drm/i915/gtt: set err to -ENOMEM on memory allocation failure · c2df2201
      Colin Ian King authored
      Currently when the allocation of ppgtt->work fails the error return
      path via err_free returns an uninitialized value in err. Fix this
      by setting err to the appropriate error return of -ENOMEM.
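The class of bug being fixed can be illustrated with a small userspace sketch. Everything here (struct sketch_ppgtt, sketch_ppgtt_create(), the fail_work_alloc switch) is hypothetical and only mirrors the shape of a goto-based error path; it is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sketch_ppgtt {
	void *work;
};

/* Hypothetical constructor; fail_work_alloc simulates an allocation failure. */
int sketch_ppgtt_create(struct sketch_ppgtt **out, int fail_work_alloc)
{
	struct sketch_ppgtt *ppgtt;
	int err;

	ppgtt = calloc(1, sizeof(*ppgtt));
	if (!ppgtt)
		return -ENOMEM;

	ppgtt->work = fail_work_alloc ? NULL : malloc(64);
	if (!ppgtt->work) {
		/* Without this assignment err would be returned uninitialized. */
		err = -ENOMEM;
		goto err_free;
	}

	*out = ppgtt;
	return 0;

err_free:
	free(ppgtt);
	return err;
}
```

Every jump to the shared error label has to set err first; the failing allocation was the one path that did not.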
      
      Addresses-Coverity: ("Uninitialized scalar variable")
      Fixes: d3622099 ("drm/i915/gtt: Always acquire struct_mutex for gen6_ppgtt_cleanup")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190524212627.24256-1-colin.king@canonical.com
    • drm/i915/dsi: Call drm_connector_cleanup on vlv_dsi_init error exit path · 5c27de1d
      Hans de Goede authored
      
      
      If we exit vlv_dsi_init() because we failed to find a fixed_mode, then
      we've already called drm_connector_init() and we should call
      drm_connector_cleanup() to unregister the connector object.
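The init/cleanup pairing can be sketched with stand-in types. The sketch_* names and the boolean have_fixed_mode parameter are hypothetical; the real vlv_dsi_init() and the DRM connector API are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_connector {
	bool initialized;
};

void sketch_connector_init(struct sketch_connector *c)
{
	c->initialized = true;
}

void sketch_connector_cleanup(struct sketch_connector *c)
{
	c->initialized = false;
}

/* Hypothetical init path; have_fixed_mode simulates the fixed-mode lookup. */
int sketch_dsi_init(struct sketch_connector *c, bool have_fixed_mode)
{
	sketch_connector_init(c);

	if (!have_fixed_mode)
		goto err_cleanup_connector;

	return 0;

err_cleanup_connector:
	/* Undo the earlier init so no half-set-up connector is left behind. */
	sketch_connector_cleanup(c);
	return -1;
}
```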
      
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190524163518.17545-1-hdegoede@redhat.com
  3. May 25, 2019
  4. May 24, 2019
  5. May 23, 2019
    • drm/i915: remove duplicate typedef for intel_wakeref_t · 09a93ef3
      Jani Nikula authored
      
      
      Fix the duplicate typedef for intel_wakeref_t leading to Clang build
      issues. While at it, actually make the intel_runtime_pm.h header
      self-contained, which was claimed in the commit being fixed.
      
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      References: http://mid.mail-archive.com/20190521183850.GA9157@archlinux-epyc
      References: https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/201754420#L2435
      Fixes: 0d5adc5f ("drm/i915: extract intel_runtime_pm.h from intel_drv.h")
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Tested-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190522103505.2082-1-jani.nikula@intel.com
    • drm/i915: Update DRIVER_DATE to 20190523 · cfc0e7bb
      Jani Nikula authored
      
      
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    • drm/i915/dp: Support DP ports YUV 4:2:0 output to GEN11 · 47d0ccec
      Gwan-gyeong Mun authored
      
      
      According to Bspec, GEN10 supports YUV 4:2:0 output only on HDMI ports,
      while GEN11 supports YUV 4:2:0 output on both DP and HDMI ports.
      
      v2: Minor style fix.
      
      Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-7-gwan-gyeong.mun@intel.com
    • drm/i915/dp: Change a link bandwidth computation for DP · 16668f48
      Gwan-gyeong Mun authored
      
      
      The data M/N calculation assumed an RGB bpp. When YCbCr 4:2:0 output
      is used on DP, the bpp must instead be computed for YCbCr 4:2:0.
      pipe_bpp assumes RGB, i.e. bpc multiplied by 3, whereas YCbCr 4:2:0
      requires a multiplier of 1.5. Therefore pipe_bpp must be divided by 2
      when the DP output uses YCbCr 4:2:0.
       - RGB format bpp = bpc x 3
       - YCbCr 4:2:0 format bpp = bpc x 1.5

      However, the link M/N values are calculated and applied based on the
      full clock for YCbCr 4:2:0, and DP YCbCr 4:2:0 does not need the pixel
      clock doubled for the dotclock calculation. Only HDMI YCbCr 4:2:0 needs
      the pixel clock doubled for the dotclock calculation.

      This only affects DP and eDP ports that use the YCbCr 4:2:0 output
      format. For now, it does not consider the DSC + YCbCr 4:2:0 use case.
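The bpp adjustment amounts to halving the RGB figure for 4:2:0 output. A minimal sketch, assuming pipe_bpp always holds the RGB value (bpc x 3); the enum and the function name sketch_output_bpp() are hypothetical and stand in for the intel_dp_output_bpp() helper mentioned in the changelog below.

```c
#include <assert.h>

enum sketch_output_format {
	SKETCH_FORMAT_RGB,
	SKETCH_FORMAT_YCBCR420,
};

/* pipe_bpp is always the RGB figure, i.e. bpc * 3. */
int sketch_output_bpp(enum sketch_output_format format, int pipe_bpp)
{
	/* 4:2:0 carries bpc * 1.5 bits per pixel, half of the RGB value. */
	if (format == SKETCH_FORMAT_YCBCR420)
		return pipe_bpp / 2;

	return pipe_bpp;
}
```

For 8 bpc this gives 24 bpp for RGB and 12 bpp for YCbCr 4:2:0; for 10 bpc, 30 and 15.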
      
      v2:
        Addressed review comments from Ville.
        Remove a changing of pipe_bpp on intel_ddi_set_pipe_settings().
        Because the pipe is running at the full bpp, keep pipe_bpp as RGB
        even though YCbCr 4:2:0 output format is used.
        Add a link bandwidth computation for YCbCr4:2:0 output format.
      
      v3:
        Addressed review comments from Ville.
        In order to make codes simple, it adds and uses intel_dp_output_bpp()
        function.
      
      v6:
        Link M/N values are calculated and applied based on the Full Clock for
        YCbCr420. The Bit per Pixel needs to be adjusted for YUV420 mode as it
        requires only half of the RGB case.
          - Link M/N values are calculated and applied based on the Full Clock
          - Data M/N values needs to be calculated considering the data is half
            due to subsampling
        Remove a doubling of pixel clock on a dot clock calculator for
        DP YCbCr 4:2:0.
        Rebase and remove a duplicate setting of vsc_sdp.DB17.
        Add a setting of dynamic range bit to  vsc_sdp.DB17.
        Change Content Type bit to "Graphics" from "Not defined".
        Change the division of pipe_bpp into a multiplication by constant
        values in a switch-case statement.
      
      v7:
        Addressed review comments from Ville.
        Move a setting of dynamic range bit and a setting of bpc which is based
        on pipe_bpp to a "drm/i915/dp: Program VSC Header and DB for Pixel
        Encoding/Colorimetry Format" commit.
        Change Content Type bit to "Not defined" from "Graphics".
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-6-gwan-gyeong.mun@intel.com
    • drm/i915/dp: Add a support of YCBCR 4:2:0 to DP MSA · ec4401d3
      Gwan-gyeong Mun authored
      
      
      When YCBCR 4:2:0 output is used for DP, we should program YCBCR 4:2:0
      into the MSA and the VSC SDP.
      
      As per DP 1.4a spec section 2.2.4.3 [MSA Field for Indication of Color
      Encoding Format and Content Color Gamut] while sending YCBCR 420 signals
      we should program MSA MISC1 fields which indicate VSC SDP for the Pixel
      Encoding/Colorimetry Format.
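As a rough illustration of the MISC1 programming, the sketch below sets bit 6, which per the changelog of this commit is the bit that tells the sink a VSC SDP carries the Pixel Encoding/Colorimetry Format. The macro and function names are hypothetical, and the 8-bit value is a stand-in for wherever the driver actually stores MISC1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MISC1 bit 6: the Source uses a VSC SDP for pixel encoding/colorimetry. */
#define SKETCH_MISC1_VSC_SDP (1u << 6)

uint8_t sketch_msa_misc1(uint8_t misc1, bool ycbcr420)
{
	/* Only 4:2:0 output needs the VSC SDP indication in this sketch. */
	if (ycbcr420)
		misc1 |= SKETCH_MISC1_VSC_SDP;

	return misc1;
}
```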
      
      v2: Block comment style fix.
      
      v6:
        Fix a wrong setting of the MSA MISC1 fields for the Pixel
        Encoding/Colorimetry Format indication. As per DP 1.4a spec Table 2-96
        [MSA MISC1 and MISC0 Fields for Pixel Encoding/Colorimetry Format
        Indication], when MISC1 bit 6 is set to 1, a Source device uses a VSC
        SDP to indicate the Pixel Encoding/Colorimetry Format. The earlier
        version set bit 5 of MISC1; it now sets bit 6 of MISC1.
      
      Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-5-gwan-gyeong.mun@intel.com
    • drm/i915/dp: Program VSC Header and DB for Pixel Encoding/Colorimetry Format · 3c053a96
      Gwan-gyeong Mun authored
      
      
      Function intel_pixel_encoding_setup_vsc handles vsc header and data block
      setup for pixel encoding / colorimetry format.
      
      Setup VSC header and data block in function intel_pixel_encoding_setup_vsc
      for pixel encoding / colorimetry format as per dp 1.4a spec,
      section 2.2.5.7.1, table 2-119: VSC SDP Header Bytes, section 2.2.5.7.5,
      table 2-120:VSC SDP Payload for DB16 through DB18.
      
      v2:
        Minor style fix. [Maarten]
        Refer to commit ids instead of patchwork. [Maarten]
      
      v6: Rebase
      
      v7:
        Rebase and addressed review comments from Ville.
        Use a structure initializer instead of memset().
        Fix non-standard comment format.
        Remove a referring to specific commit.
        Add a setting of dynamic range bit to  vsc_sdp.DB17.
        Add a setting of bpc which is based on pipe_bpp.
        Remove duplicated checking of connector's ycbcr_420_allowed from
        intel_pixel_encoding_setup_vsc(). It is already checked from
        intel_dp_ycbcr420_config().
        Remove comments for VSC_SDP_EXTENSION_FOR_COLORIMETRY_SUPPORTED. It is
        already implemented on intel_dp_get_colorimetry_status().
      
      v8:
        Failing to set the bpc during VSC setup is a fairly fatal case, so
        replace DRM_DEBUG_KMS() with MISSING_CASE(). [Maarten]

      v9: Use the renamed member of struct dp_sdp; DB was renamed to db.
      
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-4-gwan-gyeong.mun@intel.com
    • drm: Rename struct edp_vsc_psr to struct dp_sdp · 4d432f95
      Gwan-gyeong Mun authored
      
      
      The VSC SDP payload for PSR is one of the data block types of an SDP
      (Secondary Data Packet). To generalize the SDP structure name, rename
      struct edp_vsc_psr to struct dp_sdp. Each SDP type uses its data blocks
      differently and has different reserved data blocks, and the
      Video_Stream_Configuration Extension VESA SDP might use all of the Data
      Blocks as Extended INFOFRAME Data Bytes, so make the Data Block
      variables an array. Also add comments detailing the DBs of the VSC SDP
      Payload for Pixel Encoding/Colorimetry Format. These comments follow
      the DP 1.4a spec, section 2.2.5.7.5, chapter "VSC SDP Payload for Pixel
      Encoding/Colorimetry Format".
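A generic SDP with its data blocks held in an array might be laid out as below. This is a sketch only: the sketch_* names, the 32-byte payload size, and the assumption that DB16 carries the pixel encoding in its upper nibble and the colorimetry in its lower nibble are illustrative and should be checked against the DP spec and the real struct dp_sdp.

```c
#include <assert.h>
#include <stdint.h>

/* Four header bytes, common to every SDP type. */
struct sketch_sdp_header {
	uint8_t hb0, hb1, hb2, hb3;
};

/* Data blocks as an array so each SDP type can index what it needs. */
struct sketch_dp_sdp {
	struct sketch_sdp_header header;
	uint8_t db[32]; /* payload size is an assumption of this sketch */
};

/* Pack two 4-bit fields into DB16 (assumed layout: encoding high nibble). */
void sketch_set_db16(struct sketch_dp_sdp *sdp, uint8_t pixel_encoding,
		     uint8_t colorimetry)
{
	sdp->db[16] = (uint8_t)(((pixel_encoding & 0xf) << 4) |
				(colorimetry & 0xf));
}
```

With an array, code for a new SDP type only needs an index into db[] rather than a new named struct member.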
      
      v7: Addressed review comments from Ville.
      
      v9: Rename a member value name DB to db on struct dp_sdp [Laurent]
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-3-gwan-gyeong.mun@intel.com
    • drm/i915/dp: Add a config function for YCBCR420 outputs · 8e9d645c
      Gwan-gyeong Mun authored
      
      
      This patch checks for YCBCR420 output support at the encoder level.
      If the input mode is a YCBCR420-only mode, it prepares DP for YCBCR420
      output; otherwise it continues with RGB output mode.
      It sets output_format to INTEL_OUTPUT_FORMAT_YCBCR420 in order to use
      a pipe scaler for the RGB to YCbCr 4:4:4 conversion.
      
      v2:
        Addressed review comments from Ville.
        Style fixed with few naming.
        %s/config/crtc_state/
        %s/intel_crtc/crtc/
        If lspcon is active, do not call intel_dp_ycbcr420_config(), to avoid
        clobbering the lspcon_ycbcr420_config() routine.
        Also move the 420_only check into intel_dp_ycbcr420_config().
      
      v3: Fix an uninitialized return value, reported by Dan Carpenter.
      
      v4:
        Addressed review comments from Ville.
        In order to avoid the extra indentation, it inverts if-clause on
        intel_dp_ycbcr420_config().
        Remove the error print where no errors print are allowed.
      
      v6: Rebase
      
      v7:
        Move intel_dp_get_colorimetry_status() to intel_dp from intel_psr.
        intel_dp_get_colorimetry_status() checks
        VSC_SDP_EXTENSION_FOR_COLORIMETRY_SUPPORTED bit in the
        DPRX_FEATURE_ENUMERATION_LIST register.
        And intel_dp_ycbcr420_config() uses intel_dp_get_colorimetry_status().
      
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521121721.32010-2-gwan-gyeong.mun@intel.com
  6. May 22, 2019
    • drm/i915: Engine discovery query · c5d3e39c
      Tvrtko Ursulin authored
      
      
      Engine discovery query allows userspace to enumerate engines, probe their
      configuration features, all without needing to maintain the internal PCI
      ID based database.
      
      A new query for the generic i915 query ioctl is added named
      DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure
      drm_i915_query_engine_info. The address of the latter should be passed to the
      kernel in the query.data_ptr field, and should be large enough for the
      kernel to fill out all known engines as struct drm_i915_engine_info
      elements trailing the query.
      
      As with other queries, setting the item query length to zero allows
      userspace to query minimum required buffer size.
      
      Enumerated engines have common type mask which can be used to query all
      hardware engines, versus engines userspace can submit to using the execbuf
      uAPI.
      
      Engines also have capabilities which are per engine class namespace of
      bits describing features not present on all engine instances.
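The "probe with length zero, then fetch" protocol described above can be sketched with a stand-in for the kernel side. All sketch_* names, the engine table, and the use of -EINVAL for a too-small buffer are assumptions made for illustration; this does not call the real DRM_I915_QUERY_ENGINE_INFO ioctl or use the real uAPI structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct sketch_engine_info {
	uint16_t engine_class;
	uint16_t engine_instance;
};

struct sketch_query_item {
	int32_t length;  /* in: buffer size; out: bytes needed/written or error */
	void *data_ptr;  /* in: userspace buffer, unused when probing */
};

/* Made-up engine list standing in for what the driver would report. */
static const struct sketch_engine_info sketch_engines[] = {
	{ 0, 0 }, { 1, 0 }, { 2, 0 }, { 2, 1 },
};

/* Stand-in for the kernel side of the query. */
void sketch_query_engines(struct sketch_query_item *item)
{
	const int32_t needed = (int32_t)sizeof(sketch_engines);

	if (item->length == 0) {
		/* Probe: only report the minimum required buffer size. */
		item->length = needed;
	} else if (item->length < needed || !item->data_ptr) {
		/* Buffer too small or missing. */
		item->length = -EINVAL;
	} else {
		memcpy(item->data_ptr, sketch_engines, sizeof(sketch_engines));
		item->length = needed;
	}
}
```

Userspace would call once with length set to zero, allocate the reported number of bytes, and call again with the buffer attached.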
      
      v2:
       * Fixed HEVC assignment.
       * Reorder some fields, rename type to flags, increase width. (Lionel)
       * No need to allocate temporary storage if we do it engine by engine.
         (Lionel)
      
      v3:
       * Describe engine flags and mark mbz fields. (Lionel)
       * HEVC only applies to VCS.
      
      v4:
       * Squash SFC flag into main patch.
       * Tidy some comments.
      
      v5:
       * Add uabi_ prefix to engine capabilities. (Chris Wilson)
       * Report exact size of engine info array. (Chris Wilson)
       * Drop the engine flags. (Joonas Lahtinen)
       * Added some more reserved fields.
       * Move flags after class/instance.
      
      v6:
       * Do not check engine info array was zeroed by userspace but zero the
         unused fields for them instead.
      
      v7:
       * Simplify length calculation loop. (Lionel Landwerlin)
      
      v8:
       * Remove MBZ comments where not applicable.
       * Rename ABI flags to match engine class define naming.
       * Rename SFC ABI flag to reflect it applies to VCS and VECS.
       * SFC is wired to even _logical_ engine instances.
       * SFC applies to VCS and VECS.
       * HEVC is present on all instances on Gen11. (Tony)
       * Simplify length calculation even more. (Chris Wilson)
       * Move info_ptr assignment closer to loop for clarity. (Chris Wilson)
       * Use vdbox_sfc_access from runtime info.
       * Rebase for RUNTIME_INFO.
       * Refactor for lower indentation.
       * Rename uAPI class/instance to engine_class/instance to avoid C++
         keyword.
      
      v9:
       * Rebase for s/num_rings/num_engines/ in RUNTIME_INFO.
      
      v10:
       * Use new copy_query_item.
      
      v11:
       * Consolidate with struct i915_engine_class_instance.
      
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Jon Bloomfield <jon.bloomfield@intel.com>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tony Ye <tony.ye@intel.com>
      Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com
    • drm/i915/icl: Add WaDisableBankHangMode · cbe3e1d1
      Tvrtko Ursulin authored
      Disable GPU hang by default on unrecoverable ECC cache errors.
      
      v2:
       * Rebase.
      
      v3:
       * Use intel_uncore_read. (Chris)
      
      Fixes: cc38cae7 ("drm/i915/icl: Introduce initial Icelake Workarounds")
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190520110442.403-2-tvrtko.ursulin@linux.intel.com
    • drm/i915/selftests: Verify context workarounds · fde93886
      Tvrtko Ursulin authored
      
      
      Test context workarounds have been correctly applied in newly created
      contexts.
      
      To accomplish this the existing engine_wa_list_verify helper is extended
      to take in a context from which reading of the workaround list will be
      done.
      
      Context workaround verification is done from the existing subtests, which
      have been renamed to reflect they are no longer only about GT and engine
      workarounds.
      
      v2:
       * Test after resets and refactor to use intel_context more. (Chris)
      
      v3:
       * Use ce->engine->i915 instead of ce->gem_context->i915. (Chris)
       * gem_engine_iter.idx is engine->id + 1. (Chris)
      
      v4:
       * Make local function static.
      
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190520142546.12493-1-tvrtko.ursulin@linux.intel.com
    • drm/i915: Allow specification of parallel execbuf · a88b6e4c
      Chris Wilson authored
      
      
      There is a desire to split a task onto two engines and have them run at
      the same time, e.g. scanline interleaving to spread the workload evenly.
      Through the use of the out-fence from the first execbuf, we can
      coordinate secondary execbuf to only become ready simultaneously with
      the first, so that with all things idle the second execbufs are executed
      in parallel with the first. The key difference here between the new
      EXEC_FENCE_SUBMIT and the existing EXEC_FENCE_IN is that the in-fence
      waits for the completion of the first request (so that all of its
      rendering results are visible to the second execbuf, the more common
      userspace fence requirement).
      
      Since we only have a single input fence slot, userspace cannot mix an
      in-fence and a submit-fence. It has to use one or the other! This is not
      such a harsh requirement, since by virtue of the submit-fence, the
      secondary execbuf inherits all of the dependencies from the first
      request, and for the application the dependencies should be common
      between the primary and secondary execbuf.
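The "one or the other" rule for the single input fence slot can be expressed as a simple flag check. The flag names and bit positions below are hypothetical placeholders, not the values from the uAPI header, and sketch_check_fence_flags() is not a real driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values; consult the uAPI header for the real ones. */
#define SKETCH_EXEC_FENCE_IN     (1u << 0)
#define SKETCH_EXEC_FENCE_SUBMIT (1u << 1)

/* Returns 0 if the flag combination is acceptable, -1 otherwise. */
int sketch_check_fence_flags(uint32_t flags)
{
	const uint32_t both = SKETCH_EXEC_FENCE_IN | SKETCH_EXEC_FENCE_SUBMIT;

	/* Only one input fence slot exists, so the two flags cannot be mixed. */
	if ((flags & both) == both)
		return -1;

	return 0;
}
```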
      
      Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Testcase: igt/gem_exec_fence/parallel
      Link: https://github.com/intel/media-driver/pull/546
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-10-chris@chris-wilson.co.uk
    • drm/i915/execlists: Virtual engine bonding · ee113690
      Chris Wilson authored
      
      
      Some users require that when a master batch is executed on one particular
      engine, a companion batch is run simultaneously on a specific slave
      engine. For this purpose, we introduce virtual engine bonding, allowing
      maps of master:slaves to be constructed to constrain which physical
      engines a virtual engine may select given a fence on a master engine.
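A master:slaves map of this kind can be pictured as a small table of bitmasks. The sketch below is hypothetical throughout: struct sketch_bond, sketch_bonded_mask(), and the choice to leave the selection unconstrained when no bond matches are assumptions for illustration, not the driver's data structures or policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_bond {
	unsigned int master;   /* id of the physical master engine */
	uint32_t sibling_mask; /* slave engines allowed for that master */
};

/* Narrow the set of candidate engines given where the master ran. */
uint32_t sketch_bonded_mask(const struct sketch_bond *bonds, size_t num_bonds,
			    unsigned int master, uint32_t all_siblings)
{
	for (size_t i = 0; i < num_bonds; i++) {
		if (bonds[i].master == master)
			return bonds[i].sibling_mask & all_siblings;
	}

	/* No bond for this master: any sibling may be chosen (assumption). */
	return all_siblings;
}
```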
      
      For the moment, we continue to ignore the issue of preemption deferring
      the master request for later. Ideally, we would like to then also remove
      the slave and run something else rather than have it stall the pipeline.
      With load balancing, we should be able to move workload around it, but
      there is a similar stall on the master pipeline while it may wait for
      the slave to be executed. At the cost of more latency for the bonded
      request, it may be interesting to launch both on their engines in
      lockstep. (Bubbles abound.)
      
      Opens: Also what about bonding an engine as its own master? It doesn't
      break anything internally, so allow the silliness.
      
      v2: Emancipate the bonds
      v3: Couple in delayed scheduling for the selftests
      v4: Handle invalid mutually exclusive bonding
      v5: Mention what the uapi does
      v6: s/nbond/num_bonds/
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-9-chris@chris-wilson.co.uk
    • drm/i915: Extend execution fence to support a callback · f71e01a7
      Chris Wilson authored
      
      
      In the next patch, we will want to configure the slave request
      depending on which physical engine the master request is executed on.
      For this, we introduce a callback from the execute fence to convey this
      information.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-8-chris@chris-wilson.co.uk
    • drm/i915: Apply an execution_mask to the virtual_engine · 78e41ddd
      Chris Wilson authored
      
      
      Allow the user to direct which physical engines of the virtual engine
      they wish to execute on, as sometimes it is necessary to override the
      load balancing algorithm.
      
      v2: Only kick the virtual engines on context-out if required
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-7-chris@chris-wilson.co.uk
    • drm/i915: Load balancing across a virtual engine · 6d06779e
      Chris Wilson authored
      
      
      Having allowed the user to define a set of engines that they will want
      to only use, we go one step further and allow them to bind those engines
      into a single virtual instance. Submitting a batch to the virtual engine
      will then forward it to any one of the set in a manner as best to
      distribute load.  The virtual engine has a single timeline across all
      engines (it operates as a single queue), so it is not able to concurrently
      run batches across multiple engines by itself; that is left up to the user
      to submit multiple concurrent batches to multiple queues. Multiple users
      will be load balanced across the system.
      
      The mechanism used for load balancing in this patch is a late greedy
      balancer. When a request is ready for execution, it is added to each
      engine's queue, and when an engine is ready for its next request it
      claims it from the virtual engine. The first engine to do so, wins, i.e.
      the request is executed at the earliest opportunity (idle moment) in the
      system.
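      
      A minimal sketch of the late greedy choice described above, assuming
      a toy model in which each sibling engine is reduced to the time at
      which it next goes idle. The names toy_engine, toy_start_time and
      toy_claim are hypothetical; the real code works with per-engine
      queues and tasklets rather than timestamps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from i915. */
struct toy_engine {
	unsigned long busy_until; /* time at which current work finishes */
};

/* Earliest time this engine could start a request ready at ready_at. */
static unsigned long toy_start_time(const struct toy_engine *e,
				    unsigned long ready_at)
{
	return e->busy_until > ready_at ? e->busy_until : ready_at;
}

/*
 * Offer the request to every sibling; the engine that can start it
 * earliest claims it (ties go to the lowest index). Returns the index
 * of the claiming engine, or -1 if there are no siblings.
 */
static int toy_claim(struct toy_engine *siblings, size_t count,
		     unsigned long ready_at, unsigned long duration)
{
	int winner = -1;

	assert(siblings != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (winner < 0 ||
		    toy_start_time(&siblings[i], ready_at) <
		    toy_start_time(&siblings[winner], ready_at))
			winner = (int)i;
	}
	if (winner < 0)
		return -1;

	/* The winner is now busy until the request completes. */
	siblings[winner].busy_until =
		toy_start_time(&siblings[winner], ready_at) + duration;
	return winner;
}
```

      The decision is "late" in that the engine is chosen only once the
      request is ready to run, not at submission time, so it reflects the
      load at that moment.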
      
      As not all HW is created equal, the user is still able to skip the
      virtual engine and execute the batch on a specific engine, all within the
      same queue. It will then be executed in order on the correct engine,
      with execution on other virtual engines being moved away due to the load
      detection.
      
      A couple of areas for potential improvement left!
      
      - The virtual engine always takes priority over equal-priority tasks.
      Mostly broken up by applying FQ_CODEL rules for prioritising new clients,
      and hopefully the virtual and real engines are not then congested (i.e.
      all work is via virtual engines, or all work is to the real engine).
      
      - We require the breadcrumb irq around every virtual engine request. For
      normal engines, we eliminate the need for the slow round trip via
      interrupt by using the submit fence and queueing in order. For virtual
      engines, we have to allow any job to transfer to a new ring, and cannot
      coalesce the submissions, so require the completion fence instead,
      forcing the persistent use of interrupts.
      
      - We only drip feed single requests through each virtual engine and onto
      the physical engines, even if there was enough work to fill all ELSP,
      leaving small stalls with an idle CS event at the end of every request.
      Could we be greedy and fill both slots? Being lazy is virtuous for load
      distribution on less-than-full workloads though.
      
      Other areas of improvement are more general, such as reducing lock
      contention, reducing dispatch overhead, looking at direct submission
      rather than bouncing around tasklets etc.
      
      sseu: Lift the restriction to allow sseu to be reconfigured on virtual
      engines composed of RENDER_CLASS (rcs).
      
      v2: macroize check_user_mbz()
      v3: Cancel virtual engines on wedging
      v4: Commence commenting
      v5: Replace 64b sibling_mask with a list of class:instance
      v6: Drop the one-element array in the uabi
      v7: Assert it is a virtual engine in to_virtual_engine()
      v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2)
      
      Link: https://github.com/intel/media-driver/pull/283
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-6-chris@chris-wilson.co.uk
      6d06779e
    • Chris Wilson's avatar
      drm/i915: Allow userspace to clone contexts on creation · b81dde71
      Chris Wilson authored
      
      
      A usecase arose out of handling context recovery in mesa, whereby they
      wish to recreate a context with fresh logical state but preserving all
      other details of the original. Currently, they create a new context and
      iterate over which bits they want to copy across, but it would be much more
      convenient if they were able to just pass in a target context to clone
      during creation. This essentially extends the setparam during creation
      to pull the details from a target context instead of the user supplied
      parameters.
      
      The ideal here is that we don't expose control over anything more than
      can be obtained via CONTEXT_PARAM. That is, userspace retains explicit
      control over all features, and this api is just convenience.
      
      For example, you could replace
      
      	struct context_param p = { .param = CONTEXT_PARAM_VM };
      
      	p.ctx_id = old_id;
      	gem_context_get_param(&p);
      
      	new_id = gem_context_create();
      
      	p.ctx_id = new_id;
      	gem_context_set_param(&p);
      
      	gem_vm_destroy(p.value); /* drop the ref to VM_ID handle */
      
      with
      
      	struct create_ext_param p = {
      	  { .name = CONTEXT_CREATE_CLONE },
      	  .clone_id = old_id,
      	  .flags = CLONE_FLAGS_VM
      	};
      	new_id = gem_context_create_ext(&p);
      
      and not have to worry about stray namespace pollution etc.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-5-chris@chris-wilson.co.uk
      b81dde71
    • Chris Wilson's avatar
      drm/i915: Re-expose SINGLE_TIMELINE flags for context creation · 8319f44c
      Chris Wilson authored
      
      
      The SINGLE_TIMELINE flag can be used to create a context such that all
      engine instances within that context share a common timeline. This can
      be useful for mixing operations between real and virtual engines, or
      when using a composite context for a single client API context.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-4-chris@chris-wilson.co.uk
      8319f44c
    • Chris Wilson's avatar
      drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[] · e620f7b3
      Chris Wilson authored
      
      
      Allow the user to specify a local engine index (as opposed to
      class:instance) that they can use to refer to a preset engine inside the
      ctx->engine[] array defined by an earlier I915_CONTEXT_PARAM_ENGINES.
      This will be useful for setting SSEU parameters on virtual engines that
      are local to the context and do not have a valid global class:instance
      lookup.
      
      Note that due to the ambiguity in using class:instance with
      ctx->engines[], if a user supplied engine map is active the user must
      specify the engine to alter by its index into the ctx->engines[].
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-3-chris@chris-wilson.co.uk
      e620f7b3
    • Chris Wilson's avatar
      drm/i915: Allow a context to define its set of engines · 976b55f0
      Chris Wilson authored
      
      
      Over the last few years, we have debated how to extend the user API to
      support an increase in the number of engines, that may be sparse and
      even be heterogeneous within a class (not all video decoders created
      equal). We settled on using (class, instance) tuples to identify a
      specific engine, with an API for the user to construct a map of engines
      to capabilities. Into this picture, we then add a challenge of virtual
      engines; one user engine that maps behind the scenes to any number of
      physical engines. To keep it general, we want the user to have full
      control over that mapping. To that end, we allow the user to constrain a
      context to define the set of engines that it can access, order fully
      controlled by the user via (class, instance). With such precise control
      in context setup, we can continue to use the existing execbuf uABI of
      specifying a single index; only now it doesn't automagically map onto
      the engines, it uses the user defined engine map from the context.
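      
      To make the indexing concrete, here is a small sketch under the
      assumption that a context simply stores the user's ordered array of
      (class, instance) pairs and that an execbuf-style index is resolved
      through that array. The types toy_engine_id and toy_context and the
      helper toy_lookup are hypothetical stand-ins, not the uABI structs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a (class, instance) tuple. */
struct toy_engine_id {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Hypothetical context holding the user-defined engine map. */
struct toy_context {
	const struct toy_engine_id *engines; /* order chosen by the user */
	size_t num_engines;                  /* may be zero (empty map) */
};

/*
 * Resolve a single submission index through the context's own map
 * instead of a fixed global table. Returns NULL if the index is out
 * of range for this context.
 */
static const struct toy_engine_id *
toy_lookup(const struct toy_context *ctx, size_t index)
{
	assert(ctx != NULL);
	if (index >= ctx->num_engines)
		return NULL;
	return &ctx->engines[index];
}
```

      In this model, the same index 0 can name a different physical engine
      in each context, because the order of the map is entirely up to the
      user who set it.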
      
      v2: Fixup freeing of local on success of get_engines()
      v3: Allow empty engines[]
      v4: s/nengine/num_engines/
      v5: Replace 64 limit on num_engines with a note that execbuf is
      currently limited to only using the first 64 engines.
      v6: Actually use the engines_mutex to guard the ctx->engines.
      
      Testcase: igt/gem_ctx_engines
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-2-chris@chris-wilson.co.uk
      976b55f0