  1. Dec 31, 2020
  2. Dec 30, 2020
  3. Dec 25, 2020
  4. Dec 24, 2020
    • drm/i915: clear the gpu reloc batch · 26ebc511
      Matthew Auld authored
      
      
      The reloc batch is short lived but can exist in the user visible ppGTT,
      and since it's backed by an internal object, which lacks page clearing,
      we should take care to clear it upfront.
      
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-2-matthew.auld@intel.com
      Cc: stable@vger.kernel.org
      26ebc511
    • drm/i915: clear the shadow batch · eeb52ee6
      Matthew Auld authored
      The shadow batch is an internal object, which doesn't have any page
      clearing, and since the batch_len can be smaller than the object, we
      should take care to clear it.
      
      Testcase: igt/gen9_exec_parse/shadow-peek
      Fixes: 4f7af194 ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers")
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-1-matthew.auld@intel.com
      Cc: stable@vger.kernel.org
      eeb52ee6
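The shadow-batch fix above can be illustrated with a minimal userspace sketch. This is not the i915 code: `copy_batch_cleared` and its parameters are hypothetical names, and plain memory stands in for the internal object. It only shows the idea that, when `batch_len` is smaller than the buffer, the unused tail has to be zeroed explicitly because nothing cleared the pages on allocation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy batch_len bytes into a shadow buffer of
 * shadow_size bytes and zero the unused tail. Returns 0 on success,
 * -1 if the batch does not fit.
 */
static int copy_batch_cleared(void *shadow, size_t shadow_size,
                              const void *batch, size_t batch_len)
{
    if (batch_len > shadow_size)
        return -1;

    memcpy(shadow, batch, batch_len);
    /* The backing memory was not cleared on allocation, so clear it here. */
    memset((char *)shadow + batch_len, 0, shadow_size - batch_len);
    return 0;
}
```

Without the `memset`, whatever previously occupied the tail of the buffer would remain readable by anyone who can see the mapping.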
    • drm/i915/gt: ce->inflight updates are now serialised · 177b7a52
      Chris Wilson authored
      
      
      Since schedule-in and schedule-out are now both always under the tasklet
      bitlock, we can reduce the individual atomic operations to simple
      instructions and worry less.
      
      This notably eliminates the race observed with intel_context_inflight in
      __engine_unpark().
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2583
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-9-chris@chris-wilson.co.uk
      177b7a52
    • drm/i915/gt: Simplify virtual engine handling for execlists_hold() · ac1a6d73
      Chris Wilson authored
      
      
      Now that the tasklet completely controls scheduling of the requests, and
      we postpone scheduling out the old requests, we can keep a hanging
      virtual request bound to the engine on which it hung, and remove it from
      the queue. On release, it will be returned to the same engine and remain
      in its queue until it is scheduled; after which point it will become
      eligible for transfer to a sibling. Instead, we could opt to resubmit the
      request along the virtual engine on unhold, making it eligible for load
      balancing immediately -- but that seems like a pointless optimisation
      for a hanging context.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-8-chris@chris-wilson.co.uk
      ac1a6d73
    • drm/i915/gt: Resubmit the virtual engine on schedule-out · f81475bb
      Chris Wilson authored
      
      
      Having recognised that we do not change the sibling until we schedule
      out, we can then defer the decision to resubmit the virtual engine from
      the unwind of the active queue to scheduling out of the virtual context.
      This improves our resilience in virtual engine scheduling, and should
      eliminate the rare cases of gem_exec_balance failing.
      
      By keeping the unwind order intact on the local engine, we can preserve
      data dependency ordering while doing a preempt-to-busy pass until we
      have determined the new ELSP. This means that if we try to timeslice
      between a virtual engine and a data-dependent ordinary request, the pair
      will maintain their relative ordering and we will avoid the
      resubmission, cancelling the timeslicing until further change.
      
      The dilemma though is that we then may end up in a situation where the
      'demotion' of the virtual request to an ordinary request in the engine
      queue results in filling the ELSP[] with virtual requests instead of
      spreading the load across the engines. To compensate for this, we mark
      each virtual request and refuse to resubmit a virtual request in the
      secondary ELSP slots, thus forcing subsequent virtual requests to be
      scheduled out after timeslicing. By delaying the decision until we
      schedule out, we will avoid unnecessary resubmission.
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2079
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2098
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-7-chris@chris-wilson.co.uk
      f81475bb
    • drm/i915/gt: Shrink the critical section for irq signaling · 66e40750
      Chris Wilson authored
      
      
      Let's only wait for the list iterator when decoupling the virtual
      breadcrumb, as the signaling of all the requests may take a long time,
      during which we do not want to keep the tasklet spinning.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Matthew Brost <matthew.brost@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-6-chris@chris-wilson.co.uk
      66e40750
    • drm/i915/gt: Remove virtual breadcrumb before transfer · bab0557c
      Chris Wilson authored
      
      
      The issue with stale virtual breadcrumbs remains. Now we have the problem
      that if the irq-signaler is still referencing the stale breadcrumb as we
      transfer it to a new sibling, the list becomes spaghetti. This is a very
      small window, but that doesn't stop it being hit infrequently. To
      prevent the lists being tangled (the iterator starting on one engine's
      b->signalers but walking onto another list), always decouple the virtual
      breadcrumb on schedule-out and make sure that the walker has stepped out
      of the lists.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Matthew Brost <matthew.brost@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-5-chris@chris-wilson.co.uk
      bab0557c
    • drm/i915/gt: Defer schedule_out until after the next dequeue · 6f0726b4
      Chris Wilson authored
      
      
      Inside schedule_out, we do extra work upon idling the context, such as
      updating the runtime, kicking off retires, kicking virtual engines.
      However, if we are processing a series of single requests per
      context, we may find ourselves scheduling out the context, only to
      immediately schedule it back in during dequeue. This is just extra work
      that we can avoid if we keep the context marked as inflight across the
      dequeue. This becomes more significant later on for minimising virtual
      engine misses.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-4-chris@chris-wilson.co.uk
      6f0726b4
    • drm/i915/gt: Decouple inflight virtual engines · 2efa2c52
      Chris Wilson authored
      
      
      Once a virtual engine has been bound to a sibling, it will remain bound
      until we finally schedule out the last active request. We can not rebind
      the context to a new sibling while it is inflight as the context save
      will conflict, hence we wait. As we cannot then use any other sibling
      while the context is inflight, only kick the bound sibling while it is
      inflight and, upon scheduling out, kick the rest (so that we can swap
      engines on timeslicing if the previously bound engine becomes
      oversubscribed).
      
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarMatthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-3-chris@chris-wilson.co.uk
      2efa2c52
    • drm/i915/gt: Use virtual_engine during execlists_dequeue · 64b7a3fa
      Chris Wilson authored
      
      
      Rather than going back and forth between the rb_node entry and the
      virtual_engine type, store the ve local and reuse it. As the
      container_of conversion from rb_node to virtual_engine requires a
      variable offset, performing that conversion just once shaves off a bit
      of code.
      
      v2: Keep a single virtual engine lookup, for typical use.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-2-chris@chris-wilson.co.uk
      64b7a3fa
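The "variable offset" remark in the commit above can be sketched in plain C. The types below are hypothetical stand-ins (`struct node` for `rb_node`, `struct virtual_engine_sketch` for the real `virtual_engine`); the point is only that, when the embedded node lives in an array indexed per engine, the offset back to the container depends on a runtime index, so it pays to convert once and keep the resulting pointer.

```c
#include <stddef.h>

/* Stand-in for struct rb_node. */
struct node {
    struct node *next;
};

/* Hypothetical layout: one embedded node per sibling engine. */
struct virtual_engine_sketch {
    int id;
    struct node nodes[4];
};

/*
 * Convert a pointer to nodes[idx] back into its container. The offset
 * depends on idx, so the compiler cannot fold it into a constant.
 */
static struct virtual_engine_sketch *node_to_ve(struct node *n,
                                                unsigned int idx)
{
    size_t offset = offsetof(struct virtual_engine_sketch, nodes) +
                    idx * sizeof(struct node);

    return (struct virtual_engine_sketch *)((char *)n - offset);
}
```

A caller would do `ve = node_to_ve(n, idx);` once at the top of its loop body and then use `ve` throughout, rather than repeating the conversion at each use.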
    • drm/i915/gt: Replace direct submit with direct call to tasklet · 16f2941a
      Chris Wilson authored
      Rather than having special case code for opportunistically calling
      process_csb() and performing a direct submit while holding the engine
      spinlock for submitting the request, simply call the tasklet directly.
      This allows us to retain the direct submission path, including the CS
      draining to allow fast/immediate submissions, without requiring any
      duplicated code paths, and most importantly greatly simplifying the
      control flow by removing reentrancy. This will enable us to close a few
      races in the virtual engines in the next few patches.
      
      The trickiest part here is to ensure that paired operations (such as
      schedule_in/schedule_out) remain under consistent locking domains,
      e.g. when pulled outside of the engine->active.lock.
      
      v2: Use bh kicking, see commit 3c53776e ("Mark HI and TASKLET
      softirq synchronous").
      v3: Update engine-reset to be tasklet aware
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
      16f2941a
    • drm/i915/gem: Optimistically prune dma-resv from the shrinker. · 6d393ef5
      Chris Wilson authored
      
      
      As we shrink an object, also see if we can prune the dma-resv of idle
      fences it is maintaining a reference to.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201223122051.4624-2-chris@chris-wilson.co.uk
      6d393ef5
    • drm/i915/gt: Prefer recycling an idle fence · d7d82f5d
      Chris Wilson authored
      
      
      If we want to reuse a fence that is in active use by the GPU, we have to
      wait an uncertain amount of time, but if we reuse an inactive fence, we
      can change it right away. Loop through the list of available fences
      twice, ignoring any active fences on the first pass.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201223122051.4624-1-chris@chris-wilson.co.uk
      d7d82f5d
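The two-pass search described in the commit above might look roughly like the following. This is a sketch under assumptions, not the driver's code: `struct fence_sketch`, its `pinned` flag, and `find_fence` are invented for illustration. The first pass skips fences that are still active on the GPU; only if nothing idle is found does the second pass accept an active one, which would then require a wait.

```c
#include <stdbool.h>
#include <stddef.h>

struct fence_sketch {
    bool pinned;  /* assumed: in use by the CPU, cannot be recycled */
    bool active;  /* still in use by the GPU, reuse requires a wait */
};

/*
 * Return the index of a fence to recycle, or -1 if none is available.
 * Pass 0 ignores active fences; pass 1 accepts them.
 */
static int find_fence(const struct fence_sketch *fences, size_t count)
{
    for (int pass = 0; pass < 2; pass++) {
        for (size_t i = 0; i < count; i++) {
            if (fences[i].pinned)
                continue;
            if (pass == 0 && fences[i].active)
                continue;
            return (int)i;
        }
    }
    return -1;
}
```

The extra pass costs one more walk of a short list, which is cheap compared with waiting on the GPU.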
    • drm/i915/gt: Consolidate the CS timestamp clocks · f170523a
      Chris Wilson authored
      
      
      Pull the GT clock information [used to derive CS timestamps and PM
      interval] under the GT so that it is local to the users. In doing so, we
      consolidate the two references for the same information, of which the
      runtime-info took note of a potential clock source override and scaling
      factors.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201223122359.22562-2-chris@chris-wilson.co.uk
      f170523a
    • drm/i915/selftests: Confirm CS_TIMESTAMP / CTX_TIMESTAMP share a clock · 8391c9b2
      Chris Wilson authored
      
      
      We assume that both timestamps are driven off the same clock [reported
      to userspace as I915_PARAM_CS_TIMESTAMP_FREQUENCY]. Verify that this is
      so by reading the timestamp registers around a busywait (on an otherwise
      idle engine so there should be no preemptions).
      
      v2: Icelake (not ehl, nor tgl) seems to be using a fixed 80ns interval
      for, and only for, CTX_TIMESTAMP -- or it may be GPU frequency and the
      test is always running at maximum frequency? As far as I can tell, this
      isolated change in behaviour is undocumented.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201223122359.22562-1-chris@chris-wilson.co.uk
      8391c9b2
    • drm/i915/selftests: Remove redundant live_context for eviction · 57f62622
      Chris Wilson authored
      
      
      We just need the context image from the logical state to force eviction
      of many contexts, so simplify by avoiding the GEM context container.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201223154509.14155-1-chris@chris-wilson.co.uk
      57f62622
  5. Dec 23, 2020
    • drm/i915/uc: Squelch load failure error message · 5be071e9
      Chris Wilson authored
      
      
      The caller determines if the failure is an error or not, so avoid
      warning when we will try again and succeed. For example,
      
      <7> [111.319321] [drm:intel_guc_fw_upload [i915]] GuC status 0x20
      <3> [111.319340] i915 0000:00:02.0: [drm] *ERROR* GuC load failed: status = 0x00000020
      <3> [111.319606] i915 0000:00:02.0: [drm] *ERROR* GuC load failed: status: Reset = 0, BootROM = 0x10, UKernel = 0x00, MIA = 0x00, Auth = 0x00
      <7> [111.320045] [drm:__uc_init_hw [i915]] GuC fw load failed: -110; will reset and retry 2 more time(s)
      <7> [111.322978] [drm:intel_guc_fw_upload [i915]] GuC status 0x8002f0ec
      
      should not have been reported as a _test_ failure, as the GuC was
      successfully loaded on the second attempt and the system remained
      operational.
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2797
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201214100949.11387-2-chris@chris-wilson.co.uk
      5be071e9
    • drm/i915: Use cmpxchg64 for 32b compatibility · 5e963508
      Chris Wilson authored
      
      
      By using the double wide cmpxchg64 on 32bit, we can use the same
      algorithm on both 32/64b systems.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201211110310.22740-1-chris@chris-wilson.co.uk
      5e963508
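The commit above does not spell out which algorithm it touches, so the following is only a generic userspace illustration of a 64-bit compare-and-exchange loop, written with C11 atomics rather than the kernel's `cmpxchg64()`. The function name `add_u64_cmpxchg` is hypothetical. The same source works whether the target is 32-bit or 64-bit, which is the property the commit relies on.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically add delta to *v using a compare-and-exchange loop and
 * return the new value. The 64-bit value is always updated as a whole.
 */
static uint64_t add_u64_cmpxchg(_Atomic uint64_t *v, uint64_t delta)
{
    uint64_t old = atomic_load(v);

    /* On failure, old is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak(v, &old, old + delta))
        ;

    return old + delta;
}
```

On some 32-bit toolchains the 64-bit atomic may be implemented via a library call (for example by linking `libatomic`); that detail is outside this sketch.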
  6. Dec 22, 2020
  7. Dec 21, 2020
  8. Dec 19, 2020
    • drm/i915: Check for rq->hwsp validity after acquiring RCU lock · 9bb36cf6
      Chris Wilson authored
      Since we allow removing the timeline map at runtime, there is a risk
      that rq->hwsp points into a stale page. To control that risk, we hold
      the RCU read lock while reading *rq->hwsp, but we missed a couple of
      important barriers. First, the unpinning / removal of the timeline map
      must be after all RCU readers into that map are complete, i.e. after an
      rcu barrier (in this case courtesy of call_rcu()). Secondly, we must
      make sure that the rq->hwsp we are about to dereference under the RCU
      lock is valid. In this case, we make the rq->hwsp pointer safe during
      i915_request_retire() and so we know that rq->hwsp may become invalid
      only after the request has been signaled. Therefore, if the request is
      not yet signaled when we acquire rq->hwsp under the RCU, we know that
      rq->hwsp will remain valid for the duration of the RCU read lock.
      
      This is a very small window that may lead to either considering the
      request not completed (causing a delay until the request is checked
      again, any wait for the request is not affected) or dereferencing an
      invalid pointer.
      
      Fixes: 3adac468 ("drm/i915: Introduce concept of per-timeline (context) HWSP")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: <stable@vger.kernel.org> # v5.1+
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201218122421.18344-1-chris@chris-wilson.co.uk
      9bb36cf6
  9. Dec 18, 2020
    • drm/i915/tgl: Add bound checks and simplify TGL REVID macros · 0a982c15
      Aditya Swarup authored
      
      
      Add bound checks for the TGL REV ID array, since older kernels may
      be used on the latest platform revisions, resulting in out-of-bounds
      access to the rev ID array. In this scenario, use the latest rev ID
      available and apply those WAs.
      
      Also, modify GT macros for TGL rev ID to reuse tgl_revids_get().
      
      Cc: José Roberto de Souza <jose.souza@intel.com>
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: Lucas De Marchi <lucas.demarchi@intel.com>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201203072359.156682-2-aditya.swarup@intel.com
      0a982c15
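The bound check described in the commit above amounts to clamping the lookup index to the last known table entry. A minimal sketch, assuming an invented table: `struct revid_sketch`, `revids_sketch`, the stepping values, and `revids_get_sketch` are all hypothetical and do not reproduce the driver's `tgl_revids_get()` or its data.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct revid_sketch {
    int disp_stepping;
    int gt_stepping;
};

/* Made-up table indexed by SoC revision ID. */
static const struct revid_sketch revids_sketch[] = {
    { 0, 0 },
    { 1, 1 },
    { 2, 3 },
};

/*
 * Look up the entry for revid. A revision newer than the table knows
 * about falls back to the last (latest) entry instead of reading past
 * the end of the array.
 */
static const struct revid_sketch *revids_get_sketch(unsigned int revid)
{
    size_t last = ARRAY_SIZE(revids_sketch) - 1;

    return &revids_sketch[revid < last ? revid : last];
}
```

Falling back to the newest known entry means an older kernel on newer silicon applies the workarounds for the latest stepping it knows, rather than indexing garbage.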
    • drm/i915/tgl: Fix REVID macros for TGL to fetch correct stepping · 83dbd74f
      Aditya Swarup authored
      Fix TGL REVID macros to fetch correct display/gt stepping based
      on SOC rev id from INTEL_REVID() macro. Previously, we were just
      returning the first element of the revid array instead of using
      the correct index based on SOC rev id.
      
      Fixes: c33298cb ("drm/i915/tgl: Fix stepping WA matching")
      Cc: José Roberto de Souza <jose.souza@intel.com>
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: Lucas De Marchi <lucas.demarchi@intel.com>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
      Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201203072359.156682-1-aditya.swarup@intel.com
      83dbd74f
    • drm/i915/gt: Track the overall awake/busy time · 8c3b1ba0
      Chris Wilson authored
      
      
      Since we wake the GT up before executing a request, and go to sleep as
      soon as it is retired, the GT wake time not only represents how long the
      device is powered up, but also provides a summary, albeit an overestimate,
      of the device runtime (i.e. the rc0 time to compare against rc6 time).
      
      v2: s/busy/awake/
      v3: software-gt-awake-time and I915_PMU_SOFTWARE_GT_AWAKE_TIME
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Matthew Brost <matthew.brost@intel.com>
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201215154456.13954-1-chris@chris-wilson.co.uk
      8c3b1ba0
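The awake-time accounting described in the commit above can be sketched as a small accumulator. This is an illustration only: `struct awake_stats` and the `gt_*_sketch` helpers are hypothetical names, timestamps are passed in as plain nanosecond counts, and no locking is shown. The total grows on every park, and a query while still awake adds the time elapsed since the last unpark.

```c
#include <stdbool.h>
#include <stdint.h>

struct awake_stats {
    bool awake;         /* currently between unpark and park */
    uint64_t start_ns;  /* time of the last unpark */
    uint64_t total_ns;  /* accumulated awake time of completed periods */
};

/* Called when the GT is woken to execute a request. */
static void gt_unpark_sketch(struct awake_stats *s, uint64_t now_ns)
{
    if (!s->awake) {
        s->awake = true;
        s->start_ns = now_ns;
    }
}

/* Called when the GT goes back to sleep after the last retire. */
static void gt_park_sketch(struct awake_stats *s, uint64_t now_ns)
{
    if (s->awake) {
        s->total_ns += now_ns - s->start_ns;
        s->awake = false;
    }
}

/* Total awake time, including the period still in progress, if any. */
static uint64_t gt_awake_time_sketch(const struct awake_stats *s,
                                     uint64_t now_ns)
{
    return s->total_ns + (s->awake ? now_ns - s->start_ns : 0);
}
```

As the commit notes, this overestimates actual execution time, since the device is awake slightly before the first request runs and until the last one is retired.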
    • drm/i915/gt: Drain the breadcrumbs just once · e3ed90b8
      Chris Wilson authored
      
      
      Matthew Brost pointed out that the while-loop on a shared breadcrumb was
      inherently fraught with danger as it competed with the other users of
      the breadcrumbs. However, in order to completely drain the re-arming irq
      worker, the while-loop is a necessity, despite my optimism that we could
      force cancellation with a couple of irq_work invocations.
      
      Given that we can't merely drop the while-loop, use an activity counter on
      the breadcrumbs to detect when we are parking the breadcrumbs for the
      last time.
      
      Based on a patch by Matthew Brost.
      
      Reported-by: Matthew Brost <matthew.brost@intel.com>
      Suggested-by: Matthew Brost <matthew.brost@intel.com>
      Fixes: 9d5612ca ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Matthew Brost <matthew.brost@intel.com>
      Reviewed-by: Matthew Brost <matthew.brost@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201217091524.10258-1-chris@chris-wilson.co.uk
      e3ed90b8
  10. Dec 17, 2020
  11. Dec 16, 2020