Skip to content
Commit e4e4268d authored by Brian Welty's avatar Brian Welty Committed by Rodrigo Vivi
Browse files

drm/xe: Fix pagefault and access counter worker functions



When processing G2H messages for pagefault or access counters, we queue a
work item and call queue_work(). This fails if the worker thread is already
queued to run.
The expectation is that the worker function will do more than process a
single item and return. It needs to either process all pending items or
requeue itself if items are pending. But requeuing will add latency and
potential context switch can occur.

We don't want to add unnecessary latency and so the worker should process
as many faults as it can within a reasonable duration of time.
We also do not want to hog the cpu core, so here we execute in a loop
and requeue if still running after more than 20 ms.
This seems reasonable framework and easy to tune this futher if needed.

This resolves issues seen with several igt@xe_exec_fault_mode subtests
where the GPU will hang when KMD ignores a pending pagefault.

v2: requeue the worker instead of having an internal processing loop.
v3: implement hybrid model of v1 and v2
    now, run for 20 msec before we will requeue if still running
v4: only requeue in worker if queue is non-empty (Matt B)

Signed-off-by: default avatarBrian Welty <brian.welty@intel.com>
Reviewed-by: default avatarMatthew Brost <matthew.brost@intel.com>
Reviewed-by: default avatarStuart Summers <stuart.summers@intel.com>
Signed-off-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
parent 57162274
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment