  1. Jan 12, 2023
    • adreno: Shutdown the GPU properly · e752e545
      Joel Fernandes (Google) authored
      
      
      During kexec on an ARM device, we notice that device_shutdown() only calls
      pm_runtime_force_suspend() while shutting down the GPU. This means the GPU
      kthread is still running and, further, there may be active submits.
      
      This causes all kinds of issues during a kexec reboot:
      
      Warning from shutdown path:
      
      [  292.509662] WARNING: CPU: 0 PID: 6304 at [...] adreno_runtime_suspend+0x3c/0x44
      [  292.509863] Hardware name: Google Lazor (rev3 - 8) with LTE (DT)
      [  292.509872] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  292.509881] pc : adreno_runtime_suspend+0x3c/0x44
      [  292.509891] lr : pm_generic_runtime_suspend+0x30/0x44
      [  292.509905] sp : ffffffc014473bf0
      [...]
      [  292.510043] Call trace:
      [  292.510051]  adreno_runtime_suspend+0x3c/0x44
      [  292.510061]  pm_generic_runtime_suspend+0x30/0x44
      [  292.510071]  pm_runtime_force_suspend+0x54/0xc8
      [  292.510081]  adreno_shutdown+0x1c/0x28
      [  292.510090]  platform_shutdown+0x2c/0x38
      [  292.510104]  device_shutdown+0x158/0x210
      [  292.510119]  kernel_restart_prepare+0x40/0x4c
      
      And an SError Oops from the GPU kthread:
      
      [  192.648789]  el1h_64_error+0x7c/0x80
      [  192.648812]  el1_interrupt+0x20/0x58
      [  192.648833]  el1h_64_irq_handler+0x18/0x24
      [  192.648854]  el1h_64_irq+0x7c/0x80
      [  192.648873]  local_daif_inherit+0x10/0x18
      [  192.648900]  el1h_64_sync_handler+0x48/0xb4
      [  192.648921]  el1h_64_sync+0x7c/0x80
      [  192.648941]  a6xx_gmu_set_oob+0xbc/0x1fc
      [  192.648968]  a6xx_hw_init+0x44/0xe38
      [  192.648991]  msm_gpu_hw_init+0x48/0x80
      [  192.649013]  msm_gpu_submit+0x5c/0x1a8
      [  192.649034]  msm_job_run+0xb0/0x11c
      [  192.649058]  drm_sched_main+0x170/0x434
      [  192.649086]  kthread+0x134/0x300
      [  192.649114]  ret_from_fork+0x10/0x20
      
      Fix by calling adreno_system_suspend() in the device_shutdown() path.
      
      [ Applied Rob Clark feedback on fixing adreno_unbind() similarly, also
        tested as above. ]
      
      Cc: Rob Clark <robdclark@chromium.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ricardo Ribalda <ribalda@chromium.org>
      Cc: Ross Zwisler <zwisler@kernel.org>
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Reviewed-by: Ricardo Ribalda <ribalda@chromium.org>
      Reviewed-by: Rob Clark <robdclark@gmail.com>
      Patchwork: https://patchwork.freedesktop.org/patch/517633/
      Link: https://lore.kernel.org/r/20230109222547.1368644-1-joel@joelfernandes.org
      
      
      Signed-off-by: Rob Clark <robdclark@chromium.org>
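
The ordering problem described in this commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct model_gpu` and every `model_*` function are hypothetical names invented for illustration, and the code does not reproduce the msm driver's structures or the body of adreno_system_suspend(). It shows why powering down alone leaves a window in which a still-running scheduler can touch unpowered hardware, and how parking the scheduler first closes that window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a GPU; not the msm driver's types. */
struct model_gpu {
    bool powered;        /* hardware is powered, registers reachable */
    bool sched_running;  /* scheduler thread may still pick up jobs */
    int  active_submits; /* jobs submitted but not yet retired */
};

/* Models only the power-down step; the scheduler is left running. */
void model_force_suspend(struct model_gpu *gpu)
{
    assert(gpu);
    gpu->powered = false;
}

/*
 * Models a quiescing suspend (assumed ordering): park the scheduler,
 * refuse if work is still in flight, then power down.
 * Returns 0 on success, -1 if the GPU is still busy.
 */
int model_system_suspend(struct model_gpu *gpu)
{
    assert(gpu);
    gpu->sched_running = false;
    if (gpu->active_submits > 0) {
        gpu->sched_running = true; /* undo: suspend did not happen */
        return -1;
    }
    model_force_suspend(gpu);
    return 0;
}

/*
 * The scheduler tries to run one job.
 * Returns 1 if the job ran, 0 if the scheduler is parked,
 * -1 if it touched powered-off hardware (the fault case).
 */
int model_run_job(struct model_gpu *gpu)
{
    assert(gpu);
    if (!gpu->sched_running)
        return 0;
    if (!gpu->powered)
        return -1;
    gpu->active_submits++;
    return 1;
}
```

In this model, calling model_force_suspend() and then model_run_job() reaches the fault case, while model_system_suspend() followed by model_run_job() does not touch the hardware at all.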
  2. Jan 06, 2023
  3. Jan 05, 2023
  4. Jan 04, 2023
    • drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq is not for aux transfer · 1cba0d15
      Kuogee Hsieh authored
      The DP controller handles three possible interrupt sources: HPD status,
      controller state changes, and aux read/write transactions. At every irq,
      the DP controller has to check the isr status of every interrupt source
      and service the interrupt if its isr status bits show that interrupts
      are pending. The current aux isr handler has a potential race condition
      because it always completes dp_aux_cmd_fifo_tx(), even when the irq is
      not for an aux read or write transaction. If an irq arrives while the
      host is in the middle of an aux read, waiting for the sink to finish
      transferring data, the aux read transaction may return prematurely,
      leaving unexpected data in the host's receive buffer. This patch fixes
      the problem by checking the aux isr and returning immediately from the
      aux isr handler if no isr status bits are set.
      
      There is currently a bug report regarding eDP EDID corruption during
      system boot. Lengthy debugging found that the VIDEO_READY interrupt was
      firing continuously during boot, which caused dp_aux_isr() to complete
      dp_aux_cmd_fifo_tx() prematurely and retrieve data from the aux hardware
      buffer before it contained the complete data transfer from the sink.
      This causes the EDID corruption.
      
      The following signature appears in the kernel logs when the problem happens:
      EDID has corrupt header
      panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
      
      Changes in v2:
      -- do complete if (ret == IRQ_HANDLED) at dp_aux_isr()
      -- add more commit text
      
      Changes in v3:
      -- add changes Stephen suggested
      -- dp_aux_isr() return IRQ_XXX back to caller
      -- dp_ctrl_isr() return IRQ_XXX back to caller
      
      Changes in v4:
      -- split into two patches
      
      Changes in v5:
      -- delete empty line between tags
      
      Changes in v6:
      -- remove extra "that" and fix lines longer than 75 chars in commit text
      
      Fixes: c943b494 ("drm/msm/dp: add displayPort driver support")
      Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
      Tested-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
      Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
      Patchwork: https://patchwork.freedesktop.org/patch/516121/
      Link: https://lore.kernel.org/r/1672193785-11003-2-git-send-email-quic_khsieh@quicinc.com
      
      
      Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
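
The guard this commit message describes can be sketched in plain C as follows. This is a minimal user-space model, not the driver's code: the `MODEL_ISR_*` bit layout, `struct model_aux`, and `model_aux_isr()` are all hypothetical, and the `completed` flag merely stands in for whatever wakes the waiter in dp_aux_cmd_fifo_tx(). The point is only the control flow: when no aux status bits are set, the handler returns without signalling completion, so an unrelated interrupt cannot end an aux transfer early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout; the commit text gives no register definitions. */
#define MODEL_ISR_AUX_DONE  (1u << 0)
#define MODEL_ISR_AUX_ERROR (1u << 1)
#define MODEL_ISR_AUX_MASK  (MODEL_ISR_AUX_DONE | MODEL_ISR_AUX_ERROR)

/* Stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED return values. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

struct model_aux {
    uint32_t isr;       /* latched aux interrupt status (modelled) */
    bool     completed; /* stands in for the transfer's completion */
};

/*
 * Read and acknowledge the aux status bits. Only signal completion
 * when at least one aux bit was actually pending.
 */
enum model_irqreturn model_aux_isr(struct model_aux *aux)
{
    assert(aux);

    uint32_t status = aux->isr & MODEL_ISR_AUX_MASK;
    aux->isr &= ~MODEL_ISR_AUX_MASK; /* acknowledge what was read */

    if (!status)
        return MODEL_IRQ_NONE; /* not an aux irq: leave the waiter alone */

    aux->completed = true;
    return MODEL_IRQ_HANDLED;
}
```

A caller in this model would treat MODEL_IRQ_NONE as "some other source raised the interrupt" and go on to check the remaining sources.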
  5. Dec 28, 2022
  6. Nov 26, 2022
  7. Nov 23, 2022
  8. Nov 04, 2022