- Mar 09, 2022
-
-
Przemyslaw Patynowski authored
[ Upstream commit 605ca7c5 ] Fix driver not freeing VF's traffic irqs, prior to calling pci_disable_msix in iavf_remove. There were possible 2 erroneous states in which, iavf_close would not be called. One erroneous state is fixed by allowing netdev to register, when state is already running. It was possible for VF adapter to enter state loop from running to resetting, where iavf_open would subsequently fail. If user would then unload driver/remove VF pci, iavf_close would not be called, as the netdev was not registered, leaving traffic pcis still allocated. Fixed this by breaking loop, allowing netdev to open device when adapter state is __IAVF_RUNNING and it is not explicitily downed. Other possiblity is entering to iavf_remove from __IAVF_RESETTING state, where iavf_close would not free irqs, but just return 0. Fixed this by checking for last adapter state and then removing irqs. Kernel panic: [ 2773.628585] kernel BUG at drivers/pci/msi.c:375! ... [ 2773.631567] RIP: 0010:free_msi_irqs+0x180/0x1b0 ... [ 2773.640939] Call Trace: [ 2773.641572] pci_disable_msix+0xf7/0x120 [ 2773.642224] iavf_reset_interrupt_capability.part.41+0x15/0x30 [iavf] [ 2773.642897] iavf_remove+0x12e/0x500 [iavf] [ 2773.643578] pci_device_remove+0x3b/0xc0 [ 2773.644266] device_release_driver_internal+0x103/0x1f0 [ 2773.644948] pci_stop_bus_device+0x69/0x90 [ 2773.645576] pci_stop_and_remove_bus_device+0xe/0x20 [ 2773.646215] pci_iov_remove_virtfn+0xba/0x120 [ 2773.646862] sriov_disable+0x2f/0xe0 [ 2773.647531] ice_free_vfs+0x2f8/0x350 [ice] [ 2773.648207] ice_sriov_configure+0x94/0x960 [ice] [ 2773.648883] ? _kstrtoull+0x3b/0x90 [ 2773.649560] sriov_numvfs_store+0x10a/0x190 [ 2773.650249] kernfs_fop_write+0x116/0x190 [ 2773.650948] vfs_write+0xa5/0x1a0 [ 2773.651651] ksys_write+0x4f/0xb0 [ 2773.652358] do_syscall_64+0x5b/0x1a0 [ 2773.653075] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 22ead37f ("i40evf: Add longer wait after remove module") Signed-off-by:
Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by:
Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Karen Sornek authored
[ Upstream commit 247aa001 ] Add helper function to go from pci_dev to adapter to make work simple - to go from a pci_dev to the adapter structure and make netdev assignment instead of having to go to the net_device then the adapter. Signed-off-by:
Brett Creeley <brett.creeley@intel.com> Signed-off-by:
Karen Sornek <karen.sornek@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Slawomir Laba authored
[ Upstream commit fc2e6b3b ] The driver used to crash in multiple spots when put to stress testing of the init, reset and remove paths. The user would experience call traces or hangs when creating, resetting, removing VFs. Depending on the machines, the call traces are happening in random spots, like reset restoring resources racing with driver remove. Make adapter->crit_lock mutex a mandatory lock for guarding the operations performed on all workqueues and functions dealing with resource allocation and disposal. Make __IAVF_REMOVE a final state of the driver respected by workqueues that shall not requeue, when they fail to obtain the crit_lock. Make the IRQ handler not to queue the new work for adminq_task when the __IAVF_REMOVE state is set. Fixes: 5ac49f3c ("iavf: use mutexes for locking of critical sections") Signed-off-by:
Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by:
Phani Burra <phani.r.burra@intel.com> Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Signed-off-by:
Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jedrzej Jagielski authored
[ Upstream commit bdb9e5c7 ] Add kernel trace that device was removed. Currently there is no such information. I.e. Host admin removes a PCI device from a VM, than on VM shall be info about the event. This patch adds info log to iavf_remove function. Signed-off-by:
Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by:
Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Mateusz Palczewski authored
[ Upstream commit 898ef1cb ] Use single state machine for driver initialization and for service initialized driver. The init state machine implemented in init_task() is merged into the watchdog_task(). The init_task() function is removed. Signed-off-by:
Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by:
Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by:
Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Mateusz Palczewski authored
[ Upstream commit 59756ad6 ] This commit adds a new state, __IAVF_INIT_FAILED to the state machine. From now on initialization functions report errors not by returning an error value, but by changing the state to indicate that something went wrong. Signed-off-by:
Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by:
Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by:
Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Mateusz Palczewski authored
[ Upstream commit 45eebd62 ] Replace state changes of iavf state machine with a method that also tracks the previous state the machine was on. This change is required for further work with refactoring init and watchdog state machines. Tracking of previous state would help us recover iavf after failure has occurred. Signed-off-by:
Jakub Pawlak <jakub.pawlak@intel.com> Signed-off-by:
Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by:
Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Casper Andersson authored
[ Upstream commit b3a34dc3 ] Check if operation is valid before changing any settings in hardware. Otherwise it results in changes being made despite it not being a valid operation. Fixes: 78eab33b ("net: sparx5: add vlan support") Signed-off-by:
Casper Andersson <casper.casan@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jia-Ju Bai authored
[ Upstream commit 767b9825 ] The function pci_find_capability() in t3_prep_adapter() can fail, so its return value should be checked. Fixes: 4d22de3e ("Add support for the latest 1G/10G Chelsio adapter, T3") Reported-by:
TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by:
Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sukadev Bhattiprolu authored
[ Upstream commit 36491f2d ] If we get a transport event, set the error and mark the init as complete so the attempt to send crq-init or login fail sooner rather than wait for the timeout. Fixes: bbd669a8 ("ibmvnic: Fix completion structure initialization") Signed-off-by:
Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sukadev Bhattiprolu authored
[ Upstream commit 83da53f7 ] Define and use a helper to flush the reset queue. Fixes: 2770a798 ("ibmvnic: Introduce hard reset recovery") Signed-off-by:
Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sukadev Bhattiprolu authored
[ Upstream commit 765559b1 ] We should initialize ->init_done_rc before calling complete(). Otherwise the waiting thread may see ->init_done_rc as 0 before we have updated it and may assume that the CRQ was successful. Fixes: 6b278c0c ("ibmvnic delay complete()") Signed-off-by:
Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Vincent Whitchurch authored
[ Upstream commit 087a7b94 ] In this driver's ->ndo_open() callback, it enables DMA interrupts, starts the DMA channels, then requests interrupts with request_irq(), and then finally enables napi. If RX DMA interrupts are received before napi is enabled, no processing is done because napi_schedule_prep() will return false. If the network has a lot of broadcast/multicast traffic, then the RX ring could fill up completely before napi is enabled. When this happens, no further RX interrupts will be delivered, and the driver will fail to receive any packets. Fix this by only enabling DMA interrupts after all other initialization is complete. Fixes: 523f11b5 ("net: stmmac: move hardware setup for stmmac_open to new function") Reported-by:
Lars Persson <larper@axis.com> Signed-off-by:
Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Ong Boon Leong authored
[ Upstream commit ac746c85 ] The previous stmmac_xdp_set_prog() implementation uses stmmac_release() and stmmac_open() which tear down the PHY device and causes undesirable autonegotiation which causes a delay whenever AFXDP ZC is setup. This patch introduces two new functions that just sufficiently tear down DMA descriptors, buffer, NAPI process, and IRQs and reestablish them accordingly in both stmmac_xdp_release() and stammac_xdp_open(). As the results of this enhancement, we get rid of transient state introduced by the link auto-negotiation: $ ./xdpsock -i eth0 -t -z sock0@eth0:0 txonly xdp-drv pps pkts 1.00 rx 0 0 tx 634444 634560 sock0@eth0:0 txonly xdp-drv pps pkts 1.00 rx 0 0 tx 632330 1267072 sock0@eth0:0 txonly xdp-drv pps pkts 1.00 rx 0 0 tx 632438 1899584 sock0@eth0:0 txonly xdp-drv pps pkts 1.00 rx 0 0 tx 632502 2532160 Reported-by:
Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by:
Ong Boon Leong <boon.leong.ong@intel.com> Tested-by:
Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Vincent Mailhol authored
[ Upstream commit f4896248 ] The driver uses an atomic_t variable: struct es58x_device::opened_channel_cnt to keep track of the number of opened channels in order to only allocate memory for the URBs when this count changes from zero to one. While the intent was to prevent race conditions, the choice of an atomic_t turns out to be a bad idea for several reasons: - implementation is incorrect and fails to decrement opened_channel_cnt when the URB allocation fails as reported in [1]. - even if opened_channel_cnt were to be correctly decremented, atomic_t is insufficient to cover edge cases: there can be a race condition in which 1/ a first process fails to allocate URBs memory 2/ a second process enters es58x_open() before the first process does its cleanup and decrements opened_channed_cnt. In which case, the second process would successfully return despite the URBs memory not being allocated. - actually, any kind of locking mechanism was useless here because it is redundant with the network stack big kernel lock (a.k.a. rtnl_lock) which is being hold by all the callers of net_device_ops:ndo_open() and net_device_ops:ndo_close(). c.f. the ASSERST_RTNL() calls in __dev_open() [2] and __dev_close_many() [3]. The atmomic_t is thus replaced by a simple u8 type and the logic to increment and decrement es58x_device:opened_channel_cnt is simplified accordingly fixing the bug reported in [1]. We do not check again for ASSERST_RTNL() as this is already done by the callers. [1] https://lore.kernel.org/linux-can/20220201140351.GA2548@kili/T/#u [2] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1463 [3] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1541 Fixes: 85372578 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces") Link: https://lore.kernel.org/all/20220212112713.577957-1-mailhol.vincent@wanadoo.fr Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by:
Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Thierry Reding authored
[ Upstream commit 8d3b01e0 ] Move the eDP panel on Venice 2 and Nyan boards into the corresponding AUX bus device tree node. This allows us to avoid a nasty circular dependency that would otherwise be created between the DPAUX and panel nodes via the DDC/I2C phandle. Fixes: eb481f9a ("ARM: tegra: add Acer Chromebook 13 device tree") Fixes: 59fe02cb ("ARM: tegra: Add DTS for the nyan-blaze board") Fixes: 40e231c7 ("ARM: tegra: Enable eDP for Venice2") Signed-off-by:
Thierry Reding <treding@nvidia.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Dumazet authored
[ Upstream commit ae089831 ] While kfree_rcu(ptr) _is_ supported, it has some limitations. Given that 99.99% of kfree_rcu() users [1] use the legacy two parameters variant, and @catchall objects do have an rcu head, simply use it. Choice of kfree_rcu(ptr) variant was probably not intentional. [1] including calls from net/netfilter/nf_tables_api.c Fixes: aaa31047 ("netfilter: nftables: add catch-all set element support") Signed-off-by:
Eric Dumazet <edumazet@google.com> Reviewed-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
蒋家盛 authored
[ Upstream commit a222fd85 ] As the possible failure of the ioremap(), the par_io could be NULL. Therefore it should be better to check it and return error in order to guarantee the success of the initiation. But, I also notice that all the caller like mpc85xx_qe_par_io_init() in `arch/powerpc/platforms/85xx/common.c` don't check the return value of the par_io_init(). Actually, par_io_init() needs to check to handle the potential error. I will submit another patch to fix that. Anyway, par_io_init() itsely should be fixed. Fixes: 7aa1aa6e ("QE: Move QE from arch/powerpc to drivers/soc") Signed-off-by:
Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by:
Li Yang <leoyang.li@nxp.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Christophe JAILLET authored
[ Upstream commit b9abe942 ] If 'devm_kstrdup()' fails, we should return -ENOMEM. While at it, move the 'of_node_put()' call in the error handling path and after the 'machine' has been copied. Better safe than sorry. Fixes: a6fc3b69 ("soc: fsl: add GUTS driver for QorIQ platforms") Depends-on: fddacc7ff4dd ("soc: fsl: guts: Revert commit 3c0d64e8 ") Suggested-by:
Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by:
Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by:
Li Yang <leoyang.li@nxp.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Christophe JAILLET authored
[ Upstream commit b113737c ] This reverts commit 3c0d64e8 ("soc: fsl: guts: reuse machine name from device tree"). A following patch will fix the missing memory allocation failure check instead. Suggested-by:
Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by:
Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by:
Li Yang <leoyang.li@nxp.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Anthoine Bourgeois authored
[ Upstream commit 8840f546 ] Devkit8000 board seems to always used 32k_counter as clocksource. Restore this behavior. If clocksource is back to 32k_counter, timer12 is now the clockevent source (as before) and timer2 is not longer needed here. This commit fixes the same issue observed with commit 23885389 ("ARM: dts: Fix timer regression for beagleboard revision c") when sleep is blocked until hitting keys over serial console. Fixes: aba1ad05 ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support") Fixes: e428e250 ("ARM: dts: Configure system timers for omap3") Signed-off-by:
Anthoine Bourgeois <anthoine.bourgeois@gmail.com> Signed-off-by:
Tony Lindgren <tony@atomide.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Anthoine Bourgeois authored
[ Upstream commit 64324ef3 ] This patch allow lcd43 and lcd70 flavors to benefit from timer evolution. Fixes: e428e250 ("ARM: dts: Configure system timers for omap3") Signed-off-by:
Anthoine Bourgeois <anthoine.bourgeois@gmail.com> Signed-off-by:
Tony Lindgren <tony@atomide.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Chuanhong Guo authored
[ Upstream commit cc19db8b ] It's reported that current memory detection code occasionally detects larger memory under some bootloaders. Current memory detection code tests whether address space wraps around on KSEG0, which is unreliable because it's cached. Rewrite memory size detection to perform the same test on KSEG1 instead. While at it, this patch also does the following two things: 1. use a fixed pattern instead of a random function pointer as the magic value. 2. add an additional memory write and a second comparison as part of the test to prevent possible smaller memory detection result due to leftover values in memory. Fixes: 139c949f MIPS: ("ralink: mt7621: add memory detection support") Reported-by:
Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by:
Chuanhong Guo <gch981213@gmail.com> Tested-by:
Sergio Paracuellos <sergio.paracuellos@gmail.com> Tested-by:
Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Suravee Suthikulpanit authored
[ Upstream commit 6b0b2d9a ] The current logic updates the I/O page table mode for the domain before calling the logic to free memory used for the page table. This results in IOMMU page table memory leak, and can be observed when launching VM w/ pass-through devices. Fix by freeing the memory used for page table before updating the mode. Cc: Joerg Roedel <joro@8bytes.org> Reported-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Tested-by:
Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Fixes: e42ba063 ("iommu/amd: Restructure code for freeing page table") Link: https://lore.kernel.org/all/20220118194720.urjgi73b7c3tq2o6@oracle.com/ Link: https://lore.kernel.org/r/20220210154745.11524-1-suravee.suthikulpanit@amd.com Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Robin Murphy authored
[ Upstream commit 31eeb6b0 ] Although it is painstakingly honest to describe all 3 PCI windows in "dma-ranges", it misses the the subtle distinction that the window for the GICv2m range is normally programmed for Device memory attributes rather than Normal Cacheable like the DRAM windows. Since MMU-401 only offers stage 2 translation, this means that when the PCI SMMU is enabled, accesses through that IPA range unexpectedly lose coherency if mapped as cacheable at the SMMU, due to the attribute combining rules. Since an extra 256KB is neither here nor there when we still have 10GB worth of usable address space, rather than attempting to describe and cope with this detail let's just remove the offending range. If the SMMU is not used then it makes no difference anyway. Link: https://lore.kernel.org/r/856c3f7192c6c3ce545ba67462f2ce9c86ed6b0c.1643046936.git.robin.murphy@arm.com Fixes: 4ac4d146 ("arm64: dts: juno: Describe PCI dma-ranges") Reported-by:
Anders Roxell <anders.roxell@linaro.org> Acked-by:
Liviu Dudau <liviu.dudau@arm.com> Signed-off-by:
Robin Murphy <robin.murphy@arm.com> Signed-off-by:
Sudeep Holla <sudeep.holla@arm.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Peter Zijlstra authored
commit b1e82065 upstream. Where commit 4ef0c5c6 ("kernel/sched: Fix sched_fork() access an invalid sched_task_group") fixed a fork race vs cgroup, it opened up a race vs syscalls by not placing the task on the runqueue before it gets exposed through the pidhash. Commit 13765de8 ("sched/fair: Fix fault in reweight_entity") is trying to fix a single instance of this, instead fix the whole class of issues, effectively reverting this commit. Fixes: 4ef0c5c6 ("kernel/sched: Fix sched_fork() access an invalid sched_task_group") Reported-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by:
Tadeusz Struk <tadeusz.struk@linaro.org> Tested-by:
Zhang Qiao <zhangqiao22@huawei.com> Tested-by:
Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lkml.kernel.org/r/YgoeCbwj5mbCR0qA@hirez.programming.kicks-ass.net Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Heiko Carstens authored
commit c194dad2 upstream. s390 has a swap_ex_entry_fixup function, however it is not being used since common code expects a swap_ex_entry_fixup define. If it is not defined the default implementation will be used. So fix this by adding a proper define. However also the implementation of the function must be fixed, since a NULL value for handler has a special meaning and must not be adjusted. Luckily all of this doesn't fix a real bug currently: the main extable is correctly sorted during build time, and for runtime sorting there is currently no case where the handler field is not NULL. Fixes: 05a68e89 ("s390/kernel: expand exception table logic to allow new handling options") Acked-by:
Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by:
Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by:
Heiko Carstens <hca@linux.ibm.com> Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hugh Dickins authored
commit f2b277c4 upstream. Wangyong reports: after enabling tmpfs filesystem to support transparent hugepage with the following command: echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled the docker program tries to add F_SEAL_WRITE through the following command, but it fails unexpectedly with errno EBUSY: fcntl(5, F_ADD_SEALS, F_SEAL_WRITE) = -1. That is because memfd_tag_pins() and memfd_wait_for_pins() were never updated for shmem huge pages: checking page_mapcount() against page_count() is hopeless on THP subpages - they need to check total_mapcount() against page_count() on THP heads only. Make memfd_tag_pins() (compared > 1) as strict as memfd_wait_for_pins() (compared != 1): either can be justified, but given the non-atomic total_mapcount() calculation, it is better now to be strict. Bear in mind that total_mapcount() itself scans all of the THP subpages, when choosing to take an XA_CHECK_SCHED latency break. Also fix the unlikely xa_is_value() case in memfd_wait_for_pins(): if a page has been swapped out since memfd_tag_pins(), then its refcount must have fallen, and so it can safely be untagged. Link: https://lkml.kernel.org/r/a4f79248-df75-2c8c-3df-ba3317ccb5da@google.com Signed-off-by:
Hugh Dickins <hughd@google.com> Reported-by:
Zeal Robot <zealci@zte.com.cn> Reported-by:
wangyong <wang.yong12@zte.com.cn> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: CGEL ZTE <cgel.zte@gmail.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Song Liu <songliubraving@fb.com> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sukadev Bhattiprolu authored
commit 8d0657f3 upstream. Fix a tiny memory leak when flushing the reset work queue. Fixes: 2770a798 ("ibmvnic: Introduce hard reset recovery") Signed-off-by:
Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sasha Neftin authored
commit c4208653 upstream. Similar to "igc_read_phy_reg_gpy: drop premature return" patch. igc_write_phy_reg_gpy checks the return value from igc_write_phy_reg_mdic and if it's not 0, returns immediately. By doing this, it leaves the HW semaphore in the acquired state. Drop this premature return statement, the function returns after releasing the semaphore immediately anyway. Fixes: 5586838f ("igc: Add code for PHY support") Suggested-by:
Dima Ruinskiy <dima.ruinskiy@intel.com> Reported-by:
Corinna Vinschen <vinschen@redhat.com> Signed-off-by:
Sasha Neftin <sasha.neftin@intel.com> Tested-by:
Naama Meir <naamax.meir@linux.intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Samuel Holland authored
commit bac129db upstream. This driver, like several others, uses a chained IRQ for each GPIO bank, and forwards .irq_set_wake to the GPIO bank's upstream IRQ. As a result, a call to irq_set_irq_wake() needs to lock both the upstream and downstream irq_desc's. Lockdep considers this to be a possible deadlock when the irq_desc's share lockdep classes, which they do by default: ============================================ WARNING: possible recursive locking detected 5.17.0-rc3-00394-gc849047c2473 #1 Not tainted -------------------------------------------- init/307 is trying to acquire lock: c2dfe27c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0x58/0xa0 but task is already holding lock: c3c0ac7c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0x58/0xa0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&irq_desc_lock_class); lock(&irq_desc_lock_class); *** DEADLOCK *** May be due to missing lock nesting notation 4 locks held by init/307: #0: c1f29f18 (system_transition_mutex){+.+.}-{3:3}, at: __do_sys_reboot+0x90/0x23c #1: c20f7760 (&dev->mutex){....}-{3:3}, at: device_shutdown+0xf4/0x224 #2: c2e804d8 (&dev->mutex){....}-{3:3}, at: device_shutdown+0x104/0x224 #3: c3c0ac7c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0x58/0xa0 stack backtrace: CPU: 0 PID: 307 Comm: init Not tainted 5.17.0-rc3-00394-gc849047c2473 #1 Hardware name: Allwinner sun8i Family unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x68/0x90 dump_stack_lvl from __lock_acquire+0x1680/0x31a0 __lock_acquire from lock_acquire+0x148/0x3dc lock_acquire from _raw_spin_lock_irqsave+0x50/0x6c _raw_spin_lock_irqsave from __irq_get_desc_lock+0x58/0xa0 __irq_get_desc_lock from irq_set_irq_wake+0x2c/0x19c irq_set_irq_wake from irq_set_irq_wake+0x13c/0x19c [tail call from sunxi_pinctrl_irq_set_wake] irq_set_irq_wake from gpio_keys_suspend+0x80/0x1a4 gpio_keys_suspend from gpio_keys_shutdown+0x10/0x2c gpio_keys_shutdown from device_shutdown+0x180/0x224 device_shutdown from __do_sys_reboot+0x134/0x23c __do_sys_reboot from ret_fast_syscall+0x0/0x1c However, this can never deadlock because the upstream and downstream IRQs are never the same (nor do they even involve the same irqchip). Silence this erroneous lockdep splat by applying what appears to be the usual fix of moving the GPIO IRQs to separate lockdep classes. Fixes: a59c99d9 ("pinctrl: sunxi: Forward calls to irq_set_irq_wake") Reported-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Samuel Holland <samuel@sholland.org> Reviewed-by:
Jernej Skrabec <jernej.skrabec@gmail.com> Tested-by:
Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20220216040037.22730-1-samuel@sholland.org Signed-off-by:
Linus Walleij <linus.walleij@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Amit Cohen authored
commit dc975207 upstream. The test adds tc filters and checks how many of them were offloaded by grepping for 'in_hw'. iproute2 commit f4cd4f127047 ("tc: add skip_hw and skip_sw to control action offload") added offload indication to tc actions, producing the following output: $ tc filter show dev swp2 ingress ... filter protocol ipv6 pref 1000 flower chain 0 handle 0x7c0 eth_type ipv6 dst_ip 2001:db8:1::7bf skip_sw in_hw in_hw_count 1 action order 1: police 0x7c0 rate 10Mbit burst 100Kb mtu 2Kb action drop overhead 0b ref 1 bind 1 not_in_hw used_hw_stats immediate The current grep expression matches on both 'in_hw' and 'not_in_hw', resulting in incorrect results. Fix that by using JSON output instead. Fixes: 5061e773 ("selftests: mlxsw: Add scale test for tc-police") Signed-off-by:
Amit Cohen <amcohen@nvidia.com> Reviewed-by:
Petr Machata <petrm@nvidia.com> Signed-off-by:
Ido Schimmel <idosch@nvidia.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mat Martineau authored
commit 877d11f0 upstream. Syzkaller with UBSAN uncovered a scenario where a large number of DATA_FIN retransmits caused a shift-out-of-bounds in the DATA_FIN timeout calculation: ================================================================================ UBSAN: shift-out-of-bounds in net/mptcp/protocol.c:470:29 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 1 PID: 13059 Comm: kworker/1:0 Not tainted 5.17.0-rc2-00630-g5fbf21c90c60 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Workqueue: events mptcp_worker Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 ubsan_epilogue+0xb/0x5a lib/ubsan.c:151 __ubsan_handle_shift_out_of_bounds.cold+0xb2/0x20e lib/ubsan.c:330 mptcp_set_datafin_timeout net/mptcp/protocol.c:470 [inline] __mptcp_retrans.cold+0x72/0x77 net/mptcp/protocol.c:2445 mptcp_worker+0x58a/0xa70 net/mptcp/protocol.c:2528 process_one_work+0x9df/0x16d0 kernel/workqueue.c:2307 worker_thread+0x95/0xe10 kernel/workqueue.c:2454 kthread+0x2f4/0x3b0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> ================================================================================ This change limits the maximum timeout by limiting the size of the shift, which keeps all intermediate values in-bounds. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/259 Fixes: 6477dd39 ("mptcp: Retransmit DATA_FIN") Acked-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Randy Dunlap authored
commit 7b83299e upstream. early_param() handlers should return 0 on success. __setup() handlers should return 1 on success, i.e., the parameter has been handled. A return of 0 would cause the "option=value" string to be added to init's environment strings, polluting it. ../arch/arm/mm/mmu.c: In function 'test_early_cachepolicy': ../arch/arm/mm/mmu.c:215:1: error: no return statement in function returning non-void [-Werror=return-type] ../arch/arm/mm/mmu.c: In function 'test_noalign_setup': ../arch/arm/mm/mmu.c:221:1: error: no return statement in function returning non-void [-Werror=return-type] Fixes: b849a60e ("ARM: make cr_alignment read-only #ifndef CONFIG_CPU_CP15") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Reported-by:
Igor Zhbanov <i.zhbanov@omprussia.ru> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: linux-arm-kernel@lists.infradead.org Cc: patches@armlinux.org.uk Signed-off-by:
Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Randy Dunlap authored
commit 1e6ae0e4 upstream. Correct a typo/pasto: setnocoherentio() should set dma_default_coherent to false, not true. Fixes: 14ac09a6 ("MIPS: refactor the runtime coherent vs noncoherent DMA indicators") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Russell King (Oracle) authored
commit d920eaa4 upstream. The kgdb code needs to register an undef hook for the Thumb UDF instruction that will fault in order to be functional on Thumb2 platforms. Reported-by:
Johannes Stezenbach <js@sig21.net> Tested-by:
Johannes Stezenbach <js@sig21.net> Fixes: 5cbad0eb ("kgdb: support for ARCH=arm") Signed-off-by:
Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Corinna Vinschen authored
commit fda26354 upstream. igc_read_phy_reg_gpy checks the return value from igc_read_phy_reg_mdic and if it's not 0, returns immediately. By doing this, it leaves the HW semaphore in the acquired state. Drop this premature return statement, the function returns after releasing the semaphore immediately anyway. Fixes: 5586838f ("igc: Add code for PHY support") Signed-off-by:
Corinna Vinschen <vinschen@redhat.com> Acked-by:
Sasha Neftin <sasha.neftin@intel.com> Tested-by:
Naama Meir <naamax.meir@linux.intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Brian Norris authored
commit b5fbaf7d upstream. Commit b18c6c3c ("ASoC: rockchip: cdn-dp sound output use spdif") switched the platform to SPDIF, but we didn't fix up the device tree. Drop the pinctrl settings, because the 'spdif_bus' pins are either: * unused (on kevin, bob), so the settings is ~harmless * used by a different function (on scarlet), which causes probe failures (!!) Fixes: b18c6c3c ("ASoC: rockchip: cdn-dp sound output use spdif") Signed-off-by:
Brian Norris <briannorris@chromium.org> Reviewed-by:
Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20220114150129.v2.1.I46f64b00508d9dff34abe1c3e8d2defdab4ea1e5@changeid Signed-off-by:
Heiko Stuebner <heiko@sntech.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miaoqian Lin authored
commit 9826e393 upstream. The reference taken by 'of_find_device_by_node()' must be released when not needed anymore. Add the corresponding 'put_device()' in the error handling path. Fixes: 765a9d1d ("iommu/tegra-smmu: Fix mc errors on tegra124-nyan") Signed-off-by:
Miaoqian Lin <linmq006@gmail.com> Acked-by:
Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20220107080915.12686-1-linmq006@gmail.com Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vincent Mailhol authored
commit 035b0fcf upstream. The driver uses an atomic_t variable: gs_usb:active_channels to keep track of the number of opened channels in order to only allocate memory for the URBs when this count changes from zero to one. However, the driver does not decrement the counter when an error occurs in gs_can_open(). This issue is fixed by changing the type from atomic_t to u8 and by simplifying the logic accordingly. It is safe to use an u8 here because the network stack big kernel lock (a.k.a. rtnl_mutex) is being hold. For details, please refer to [1]. [1] https://lore.kernel.org/linux-can/CAMZ6Rq+sHpiw34ijPsmp7vbUpDtJwvVtdV7CvRZJsLixjAFfrg@mail.gmail.com/T/#t Fixes: d08e973a ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://lore.kernel.org/all/20220214234814.1321599-1-mailhol.vincent@wanadoo.fr Signed-off-by:
Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by:
Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-