i40e: avoid NVM acquire deadlock during NVM update
X722 devices use the AdminQ to access the NVM, and this requires taking the AdminQ lock. Because of this, we lock the AdminQ during i40e_read_nvm(), which is also called in places where the lock is already held, such as the firmware update path which wants to lock once and then unlock when finished after performing several tasks. Although this should have only affected X722 devices, commit 96a39aed ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02) added locking for all NVM reads, regardless of device family. This resulted in us accidentally causing NVM acquire timeouts on all devices, causing failed firmware updates which left the eeprom in a corrupt state. Create unsafe non-locked variants of i40e_read_nvm_word and i40e_read_nvm_buffer, __i40e_read_nvm_word and __i40e_read_nvm_buffer respectively. These variants will not take the NVM lock and are expected to only be called in places where the NVM lock is already held if needed. Since the only caller of i40e_read_nvm_buffer() was in such a path, remove it entirely in favor of the unsafe version. If necessary we can always add it back in the future. Additionally, we now need to hold the NVM lock in i40e_validate_checksum because the call to i40e_calc_nvm_checksum now assumes that the NVM lock is held. We can further move the call to read I40E_SR_SW_CHECKSUM_WORD up a bit so that we do not need to acquire the NVM lock twice. This should resolve firmware updates and also fix potential raise that could have caused the driver to report an invalid NVM checksum upon driver load. Reported-by: Stefan Assmann <sassmann@kpanic.de> Fixes: 96a39aed ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02) Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Please register or sign in to comment