- Apr 01, 2020
-
-
Nathan Chancellor authored
[ Upstream commit 7395f62d ] Clang warns: drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:2860:9: warning: converting the result of '?:' with integer constants to a boolean always evaluates to 'true' [-Wtautological-constant-compare] return DPAA_FD_DATA_ALIGNMENT ? ALIGN(headroom, ^ drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:131:34: note: expanded from macro 'DPAA_FD_DATA_ALIGNMENT' \#define DPAA_FD_DATA_ALIGNMENT (fman_has_errata_a050385() ? 64 : 16) ^ 1 warning generated. This was exposed by commit 3c68b8ff ("dpaa_eth: FMan erratum A050385 workaround") even though it appears to have been an issue since the introductory commit 9ad1a374 ("dpaa_eth: add support for DPAA Ethernet") since DPAA_FD_DATA_ALIGNMENT has never been able to be zero. Just replace the whole boolean expression with the true branch, as it is always been true. Link: https://github.com/ClangBuiltLinux/linux/issues/928 Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Reviewed-by:
Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Nicolas Cavallari authored
[ Upstream commit ba32679c ] When trying to transmit to an unknown destination, the mesh code would unconditionally transmit a HWMP PREQ even if HWMP is not the current path selection algorithm. Signed-off-by:
Nicolas Cavallari <nicolas.cavallari@green-communications.fr> Link: https://lore.kernel.org/r/20200305140409.12204-1-cavallar@lri.fr Signed-off-by:
Johannes Berg <johannes.berg@intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Wen Xiong authored
[ Upstream commit 394b6171 ] When trying to rescan disks in petitboot shell, we hit the following softlockup stacktrace: Kernel panic - not syncing: System is deadlocked on memory [ 241.223394] CPU: 32 PID: 693 Comm: sh Not tainted 5.4.16-openpower1 #1 [ 241.223406] Call Trace: [ 241.223415] [c0000003f07c3180] [c000000000493fc4] dump_stack+0xa4/0xd8 (unreliable) [ 241.223432] [c0000003f07c31c0] [c00000000007d4ac] panic+0x148/0x3cc [ 241.223446] [c0000003f07c3260] [c000000000114b10] out_of_memory+0x468/0x4c4 [ 241.223461] [c0000003f07c3300] [c0000000001472b0] __alloc_pages_slowpath+0x594/0x6d8 [ 241.223476] [c0000003f07c3420] [c00000000014757c] __alloc_pages_nodemask+0x188/0x1a4 [ 241.223492] [c0000003f07c34a0] [c000000000153e10] alloc_pages_current+0xcc/0xd8 [ 241.223508] [c0000003f07c34e0] [c0000000001577ac] alloc_slab_page+0x30/0x98 [ 241.223524] [c0000003f07c3520] [c0000000001597fc] new_slab+0x138/0x40c [ 241.223538] [c0000003f07c35f0] [c00000000015b204] ___slab_alloc+0x1e4/0x404 [ 241.223552] [c0000003f07c36c0] [c00000000015b450] __slab_alloc+0x2c/0x48 [ 241.223566] [c0000003f07c36f0] [c00000000015b754] kmem_cache_alloc_node+0x9c/0x1b4 [ 241.223582] [c0000003f07c3760] [c000000000218c48] blk_alloc_queue_node+0x34/0x270 [ 241.223599] [c0000003f07c37b0] [c000000000226574] blk_mq_init_queue+0x2c/0x78 [ 241.223615] [c0000003f07c37e0] [c0000000002ff710] scsi_mq_alloc_queue+0x28/0x70 [ 241.223631] [c0000003f07c3810] [c0000000003005b8] scsi_alloc_sdev+0x184/0x264 [ 241.223647] [c0000003f07c38a0] [c000000000300ba0] scsi_probe_and_add_lun+0x288/0xa3c [ 241.223663] [c0000003f07c3a00] [c000000000301768] __scsi_scan_target+0xcc/0x478 [ 241.223679] [c0000003f07c3b20] [c000000000301c64] scsi_scan_channel.part.9+0x74/0x7c [ 241.223696] [c0000003f07c3b70] [c000000000301df4] scsi_scan_host_selected+0xe0/0x158 [ 241.223712] [c0000003f07c3bd0] [c000000000303f04] store_scan+0x104/0x114 [ 241.223727] [c0000003f07c3cb0] [c0000000002d5ac4] dev_attr_store+0x30/0x4c [ 241.223741] [c0000003f07c3cd0] [c0000000001dbc34] sysfs_kf_write+0x64/0x78 [ 241.223756] [c0000003f07c3cf0] [c0000000001da858] kernfs_fop_write+0x170/0x1b8 [ 241.223773] [c0000003f07c3d40] [c0000000001621fc] __vfs_write+0x34/0x60 [ 241.223787] [c0000003f07c3d60] [c000000000163c2c] vfs_write+0xa8/0xcc [ 241.223802] [c0000003f07c3db0] [c000000000163df4] ksys_write+0x70/0xbc [ 241.223816] [c0000003f07c3e20] [c00000000000b40c] system_call+0x5c/0x68 As a part of the scan process Linux will allocate and configure a scsi_device for each target to be scanned. If the device is not present, then the scsi_device is torn down. As a part of scsi_device teardown a workqueue item will be scheduled and the lockups we see are because there are 250k workqueue items to be processed. Accoding to the specification of SIS-64 sas controller, max_channel should be decreased on SIS-64 adapters to 4. The patch fixes softlockup issue. Thanks for Oliver Halloran's help with debugging and explanation! Link: https://lore.kernel.org/r/1583510248-23672-1-git-send-email-wenxiong@linux.vnet.ibm.com Signed-off-by:
Wen Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Juliet Kim authored
[ Upstream commit 7d7195a0 ] The ibmvnic driver does not check the device state when the device is removed. If the device is removed while a device reset is being processed, the remove may free structures needed by the reset, causing an oops. Fix this by checking the device state before processing device remove. Signed-off-by:
Juliet Kim <julietk@linux.vnet.ibm.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Qian Cai authored
[ Upstream commit f5152416 ] Similar to the commit 02d715b4 ("iommu/vt-d: Fix RCU list debugging warnings"), there are several other places that call list_for_each_entry_rcu() outside of an RCU read side critical section but with dmar_global_lock held. Silence those false positives as well. drivers/iommu/intel-iommu.c:4288 RCU-list traversed in non-reader section!! 1 lock held by swapper/0/1: #0: ffffffff935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1ad/0xb97 drivers/iommu/dmar.c:366 RCU-list traversed in non-reader section!! 1 lock held by swapper/0/1: #0: ffffffff935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x125/0xb97 drivers/iommu/intel-iommu.c:5057 RCU-list traversed in non-reader section!! 1 lock held by swapper/0/1: #0: ffffffffa71892c8 (dmar_global_lock){++++}, at: intel_iommu_init+0x61a/0xb13 Signed-off-by:
Qian Cai <cai@lca.pw> Acked-by:
Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Marek Szyprowski authored
[ Upstream commit 07dc3678 ] Store the IOMMU mapping created by the device core of each Exynos DRM sub-device and restore it when the Exynos DRM driver is unbound. This fixes IOMMU initialization failure for the second time when a deferred probe is triggered from the bind() callback of master's compound DRM driver. This also fixes the following issue found using kmemleak detector: unreferenced object 0xc2137640 (size 64): comm "swapper/0", pid 1, jiffies 4294937900 (age 3127.400s) hex dump (first 32 bytes): 50 a3 14 c2 80 a2 14 c2 01 00 00 00 20 00 00 00 P........... ... 00 10 00 00 00 80 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<3acd268d>] arch_setup_dma_ops+0x4c/0x104 [<9f7d2cce>] of_dma_configure+0x19c/0x3a4 [<ba07704b>] really_probe+0xb0/0x47c [<4f510e4f>] driver_probe_device+0x78/0x1c4 [<7481a0cf>] device_driver_attach+0x58/0x60 [<0ff8f5c1>] __driver_attach+0xb8/0x158 [<86006144>] bus_for_each_dev+0x74/0xb4 [<10159dca>] bus_add_driver+0x1c0/0x200 [<8a265265>] driver_register+0x74/0x108 [<e0f3451a>] exynos_drm_init+0xb0/0x134 [<db3fc7ba>] do_one_initcall+0x90/0x458 [<6da35917>] kernel_init_freeable+0x188/0x200 [<db3f74d4>] kernel_init+0x8/0x110 [<1f3cddf9>] ret_from_fork+0x14/0x20 [<8cd12507>] 0x0 unreferenced object 0xc214a280 (size 128): comm "swapper/0", pid 1, jiffies 4294937900 (age 3127.400s) hex dump (first 32 bytes): 00 a0 ec ed 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<3acd268d>] arch_setup_dma_ops+0x4c/0x104 [<9f7d2cce>] of_dma_configure+0x19c/0x3a4 [<ba07704b>] really_probe+0xb0/0x47c [<4f510e4f>] driver_probe_device+0x78/0x1c4 [<7481a0cf>] device_driver_attach+0x58/0x60 [<0ff8f5c1>] __driver_attach+0xb8/0x158 [<86006144>] bus_for_each_dev+0x74/0xb4 [<10159dca>] bus_add_driver+0x1c0/0x200 [<8a265265>] driver_register+0x74/0x108 [<e0f3451a>] exynos_drm_init+0xb0/0x134 [<db3fc7ba>] do_one_initcall+0x90/0x458 [<6da35917>] kernel_init_freeable+0x188/0x200 [<db3f74d4>] kernel_init+0x8/0x110 [<1f3cddf9>] ret_from_fork+0x14/0x20 [<8cd12507>] 0x0 unreferenced object 0xedeca000 (size 4096): comm "swapper/0", pid 1, jiffies 4294937900 (age 3127.400s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<3acd268d>] arch_setup_dma_ops+0x4c/0x104 [<9f7d2cce>] of_dma_configure+0x19c/0x3a4 [<ba07704b>] really_probe+0xb0/0x47c [<4f510e4f>] driver_probe_device+0x78/0x1c4 [<7481a0cf>] device_driver_attach+0x58/0x60 [<0ff8f5c1>] __driver_attach+0xb8/0x158 [<86006144>] bus_for_each_dev+0x74/0xb4 [<10159dca>] bus_add_driver+0x1c0/0x200 [<8a265265>] driver_register+0x74/0x108 [<e0f3451a>] exynos_drm_init+0xb0/0x134 [<db3fc7ba>] do_one_initcall+0x90/0x458 [<6da35917>] kernel_init_freeable+0x188/0x200 [<db3f74d4>] kernel_init+0x8/0x110 [<1f3cddf9>] ret_from_fork+0x14/0x20 [<8cd12507>] 0x0 unreferenced object 0xc214a300 (size 128): comm "swapper/0", pid 1, jiffies 4294937900 (age 3127.400s) hex dump (first 32 bytes): 00 a3 14 c2 00 a3 14 c2 00 40 18 c2 00 80 18 c2 .........@...... 02 00 02 00 ad 4e ad de ff ff ff ff ff ff ff ff .....N.......... backtrace: [<08cbd8bc>] iommu_domain_alloc+0x24/0x50 [<b835abee>] arm_iommu_create_mapping+0xe4/0x134 [<3acd268d>] arch_setup_dma_ops+0x4c/0x104 [<9f7d2cce>] of_dma_configure+0x19c/0x3a4 [<ba07704b>] really_probe+0xb0/0x47c [<4f510e4f>] driver_probe_device+0x78/0x1c4 [<7481a0cf>] device_driver_attach+0x58/0x60 [<0ff8f5c1>] __driver_attach+0xb8/0x158 [<86006144>] bus_for_each_dev+0x74/0xb4 [<10159dca>] bus_add_driver+0x1c0/0x200 [<8a265265>] driver_register+0x74/0x108 [<e0f3451a>] exynos_drm_init+0xb0/0x134 [<db3fc7ba>] do_one_initcall+0x90/0x458 [<6da35917>] kernel_init_freeable+0x188/0x200 [<db3f74d4>] kernel_init+0x8/0x110 [<1f3cddf9>] ret_from_fork+0x14/0x20 Signed-off-by:
Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by:
Lukasz Luba <lukasz.luba@arm.com> Signed-off-by:
Inki Dae <inki.dae@samsung.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Hawking Zhang authored
[ Upstream commit f1c2cd3f ] The ROMC_INDEX/DATA offset was changed to e4/e5 since from smuio_v11 (vega20/arcturus). Signed-off-by:
Hawking Zhang <Hawking.Zhang@amd.com> Tested-by:
Candice Li <Candice.Li@amd.com> Reviewed-by:
Candice Li <Candice.Li@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Martin Leung authored
[ Upstream commit d5349775 ] [why] nv14 previously inherited soc bb from generic dcn 2, did not match watermark values according to memory team [how] add nv14 specific soc bb: copy nv2 generic that it was using from before, but changed num channels to 8 Signed-off-by:
Martin Leung <martin.leung@amd.com> Reviewed-by:
Jun Lei <Jun.Lei@amd.com> Acked-by:
Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jiang Lidong authored
[ Upstream commit e25d5dbc ] When local NET_RX backlog is full due to traffic overrun, peer veth tx_dropped counter increases. At that time, list local veth stats, rx_dropped has double value of peer tx_dropped, even bigger than transmit packets by peer. In NET_RX softirq process, if any packet drop case happens, it increases dev's rx_dropped counter and returns NET_RX_DROP. At veth tx side, it records any error returned from peer netif_rx into local dev tx_dropped counter. In veth get stats process, it puts local dev rx_dropped and peer dev tx_dropped into together as local rx_drpped value. So that it shows double value of real dropped packets number in this case. This patch ignores peer tx_dropped when counting local rx_dropped, since peer tx_dropped is duplicated to local rx_dropped at most cases. Signed-off-by:
Jiang Lidong <jianglidong3@jd.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Tycho Andersen authored
[ Upstream commit 2e5383d7 ] Older (and maybe current) versions of systemd set release_agent to "" when shutting down, but do not set notify_on_release to 0. Since 64e90a8a ("Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()"), we filter out such calls when the user mode helper path is "". However, when used in conjunction with an actual (i.e. non "") STATIC_USERMODEHELPER, the path is never "", so the real usermode helper will be called with argv[0] == "". Let's avoid this by not invoking the release_agent when it is "". Signed-off-by:
Tycho Andersen <tycho@tycho.ws> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Dajun Jin authored
[ Upstream commit 209c65b6 ] When registers a phy_device successful, should terminate the loop or the phy_device would be registered in other addr. If there are multiple PHYs without reg properties, it will go wrong. Signed-off-by:
Dajun Jin <adajunjin@gmail.com> Reviewed-by:
Andrew Lunn <andrew@lunn.ch> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Mike Gilbert authored
[ Upstream commit 2de7fb60 ] Building cpupower with -fno-common in CFLAGS results in errors due to multiple definitions of the 'cpu_count' and 'start_time' variables. ./utils/idle_monitor/snb_idle.o:./utils/idle_monitor/cpupower-monitor.h:28: multiple definition of `cpu_count'; ./utils/idle_monitor/nhm_idle.o:./utils/idle_monitor/cpupower-monitor.h:28: first defined here ... ./utils/idle_monitor/cpuidle_sysfs.o:./utils/idle_monitor/cpuidle_sysfs.c:22: multiple definition of `start_time'; ./utils/idle_monitor/amd_fam14h_idle.o:./utils/idle_monitor/amd_fam14h_idle.c:85: first defined here The -fno-common option will be enabled by default in GCC 10. Bug: https://bugs.gentoo.org/707462 Signed-off-by:
Mike Gilbert <floppym@gentoo.org> Signed-off-by:
Shuah Khan <skhan@linuxfoundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Scott Mayhew authored
[ Upstream commit 55dee1bc ] An NFS client that mounts multiple exports from the same NFS server with higher NFSv4 versions disabled (i.e. 4.2) and without forcing a specific NFS version results in fscache index cookie collisions and the following messages: [ 570.004348] FS-Cache: Duplicate cookie detected Each nfs_client structure should have its own fscache index cookie, so add the minorversion to nfs_server_key. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200145 Signed-off-by:
Scott Mayhew <smayhew@redhat.com> Signed-off-by:
Dave Wysochanski <dwysocha@redhat.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Vasily Averin authored
[ Upstream commit db8dd969 ] if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. # mount | grep cgroup # dd if=/mnt/cgroup.procs bs=1 # normal output ... 1294 1295 1296 1304 1382 584+0 records in 584+0 records out 584 bytes copied dd: /mnt/cgroup.procs: cannot skip to specified offset 83 <<< generates end of last line 1383 <<< ... and whole last line once again 0+1 records in 0+1 records out 8 bytes copied dd: /mnt/cgroup.procs: cannot skip to specified offset 1386 <<< generates last line anyway 0+1 records in 0+1 records out 5 bytes copied https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Sebastian Hense authored
[ Upstream commit 404402ab ] The mask value is provided as 64 bit and has to be casted in either 32 or 16 bit. On big endian systems the wrong half was casted which resulted in an all zero mask. Fixes: 2b64beba ("net/mlx5e: Support header re-write of partial fields in TC pedit offload") Signed-off-by:
Sebastian Hense <sebastian.hense1@ibm.com> Reviewed-by:
Roi Dayan <roid@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tariq Toukan authored
[ Upstream commit 56917766 ] We have an off-by-1 issue in the TCP seq comparison. The last sequence number that belongs to the TCP packet's payload is not "start_seq + len", but one byte before it. Fix it so the 'ends_before' is evaluated properly. This fixes a bug that results in error completions in the kTLS HW offload flows. Fixes: ffbd9ca9 ("net/mlx5e: kTLS, Fix corner-case checks in TX resync flow") Signed-off-by:
Tariq Toukan <tariqt@mellanox.com> Reviewed-by:
Boris Pismenny <borisp@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Leon Romanovsky authored
[ Upstream commit 306f354c ] The cap_mask1 isn't protected by field_select and not listed among RW fields, but it is required to be written to properly initialize ports in IB virtualization mode. Link: https://lore.kernel.org/linux-rdma/88bab94d2fd72f3145835b4518bc63dda587add6.camel@redhat.com Fixes: ab118da4 ("net/mlx5: Don't write read-only fields in MODIFY_HCA_VPORT_CONTEXT command") Signed-off-by:
Leon Romanovsky <leonro@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Aya Levin authored
[ Upstream commit 187a9830 ] For non-fatal syndromes like LOCAL_LENGTH_ERR, recovery shouldn't be triggered. In these scenarios, the RQ is not actually in ERR state. This misleads the recovery flow which assumes that the RQ is really in error state and no more completions arrive, causing crashes on bad page state. Fixes: 8276ea13 ("net/mlx5e: Report and recover from CQE with error on RQ") Signed-off-by:
Aya Levin <ayal@mellanox.com> Reviewed-by:
Tariq Toukan <tariqt@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Aya Levin authored
[ Upstream commit e239c6d6 ] In striding RQ mode, the buffers of an RX WQE are first prepared and posted to the HW using a UMR WQEs via the ICOSQ. We maintain the state of these in-progress WQEs in the RQ SW struct. In the flow of ICOSQ recovery, the corresponding RQ is not in error state, hence: - The buffers of the in-progress WQEs must be released and the RQ metadata should reflect it. - Existing RX WQEs in the RQ should not be affected. For this, wrap the dealloc of the in-progress WQEs in a function, and use it in the ICOSQ recovery flow instead of mlx5e_free_rx_descs(). Fixes: be5323c8 ("net/mlx5e: Report and recover from CQE error on ICOSQ") Signed-off-by:
Aya Levin <ayal@mellanox.com> Reviewed-by:
Tariq Toukan <tariqt@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Aya Levin authored
[ Upstream commit 39369fd5 ] When resetting the RQ (moving RQ state from RST to RDY), the driver resets the WQ's SW metadata. In striding RQ mode, we maintain a field that reflects the actual expected WQ head (including in progress WQEs posted to the ICOSQ). It was mistakenly not reset together with the WQ. Fix this here. Fixes: 8276ea13 ("net/mlx5e: Report and recover from CQE with error on RQ") Signed-off-by:
Aya Levin <ayal@mellanox.com> Reviewed-by:
Tariq Toukan <tariqt@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Aya Levin authored
[ Upstream commit 1de0306c ] Add number of WQEBBs (WQE's Basic Block) to WQE info struct. Set the number of WQEBBs on WQE post, and increment the consumer counter (cc) on completion. In case of error completions, the cc was mistakenly not incremented, keeping a gap between cc and pc (producer counter). This failed the recovery flow on the ICOSQ from a CQE error which timed-out waiting for the cc and pc to meet. Fixes: be5323c8 ("net/mlx5e: Report and recover from CQE error on ICOSQ") Signed-off-by:
Aya Levin <ayal@mellanox.com> Reviewed-by:
Tariq Toukan <tariqt@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hamdan Igbaria authored
[ Upstream commit 692b0399 ] Fix the send info write length to be (actions x action) size in bytes. Fixes: 297ccceb ("net/mlx5: DR, Expose an internal API to issue RDMA operations") Signed-off-by:
Hamdan Igbaria <hamdani@mellanox.com> Reviewed-by:
Alex Vesker <valex@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Taehee Yoo authored
[ Upstream commit 09e91dbe ] The hsr module has been supporting the list and status command. (HSR_C_GET_NODE_LIST and HSR_C_GET_NODE_STATUS) These commands send node information to the user-space via generic netlink. But, in the non-init_net namespace, these commands are not allowed because .netnsok flag is false. So, there is no way to get node information in the non-init_net namespace. Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by:
Taehee Yoo <ap420073@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Taehee Yoo authored
[ Upstream commit ca19c70f ] The hsr_get_node_list() is to send node addresses to the userspace. If there are so many nodes, it could fail because of buffer size. In order to avoid this failure, the restart routine is added. Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by:
Taehee Yoo <ap420073@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Taehee Yoo authored
[ Upstream commit 173756b8 ] hsr_get_node_{list/status}() are not under rtnl_lock() because they are callback functions of generic netlink. But they use __dev_get_by_index() without rtnl_lock(). So, it would use unsafe data. In order to fix it, rcu_read_lock() and dev_get_by_index_rcu() are used instead of __dev_get_by_index(). Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by:
Taehee Yoo <ap420073@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Petr Machata authored
[ Upstream commit 32ca98fe ] The fix referenced below causes a crash when an ERSPAN tunnel is created without passing IFLA_INFO_DATA. Fix by validating passed-in data in the same way as ipgre does. Fixes: e1f8f78f ("net: ip_gre: Separate ERSPAN newlink / changelink callbacks") Reported-by:
<syzbot+1b4ebf4dae4e510dd219@syzkaller.appspotmail.com> Signed-off-by:
Petr Machata <petrm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Petr Machata authored
[ Upstream commit e1f8f78f ] ERSPAN shares most of the code path with GRE and gretap code. While that helps keep the code compact, it is also error prone. Currently a broken userspace can turn a gretap tunnel into a de facto ERSPAN one by passing IFLA_GRE_ERSPAN_VER. There has been a similar issue in ip6gretap in the past. To prevent these problems in future, split the newlink and changelink code paths. Split the ERSPAN code out of ipgre_netlink_parms() into a new function erspan_netlink_parms(). Extract a piece of common logic from ipgre_newlink() and ipgre_changelink() into ipgre_newlink_encap_setup(). Add erspan_newlink() and erspan_changelink(). Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN") Signed-off-by:
Petr Machata <petrm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arthur Kiyanovski authored
[ Upstream commit dfdde134 ] last_keep_alive_jiffies is updated in probe and when a keep-alive event is received. In case the driver times-out on a keep-alive event, it has high chances of continuously timing-out on keep-alive events. This is because when the driver recovers from the keep-alive-timeout reset the value of last_keep_alive_jiffies is very old, and if a keep-alive event is not received before the next timer expires, the value of last_keep_alive_jiffies will cause another keep-alive-timeout reset and so forth in a loop. Solution: Update last_keep_alive_jiffies whenever the device is restored after reset. Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by:
Noam Dagan <ndagan@amazon.com> Signed-off-by:
Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arthur Kiyanovski authored
[ Upstream commit 30623e1e ] Rx req_id is an index in struct ena_eth_io_rx_cdesc_base. The driver should validate that the Rx req_id it received from the device is in range [0, ring_size -1]. Failure to do so could yield to potential memory access violoation. The validation was mistakenly done when refilling the Rx submission queue and not in Rx completion queue. Fixes: ad974bae ("net: ena: add support for out of order rx buffers refill") Signed-off-by:
Noam Dagan <ndagan@amazon.com> Signed-off-by:
Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arthur Kiyanovski authored
[ Upstream commit e02ae6ed ] Bug: In short the main issue is caused by the fact that the number of queues is changed using ethtool after ena_probe() has been called and before ena_up() was executed. Here is the full scenario in detail: * ena_probe() is called when the driver is loaded, the driver is not up yet at the end of ena_probe(). * The number of queues is changed -> io_queue_count is changed as well - ena_up() is not called since the "dev_was_up" boolean in ena_update_queue_count() is false. * ena_up() is called by the kernel (it's called asynchronously some time after ena_probe()). ena_setup_io_intr() is called by ena_up() and it uses io_queue_count to get the suitable irq lines for each msix vector. The function ena_request_io_irq() is called right after that and it uses msix_vecs - This value only changes during ena_probe() and ena_restore() - to request the irq vectors. This results in "Failed to request I/O IRQ" error for i > io_queue_count. Numeric example: * After ena_probe() io_queue_count = 8, msix_vecs = 9. * The number of queues changes to 4 -> io_queue_count = 4, msix_vecs = 9. * ena_up() is executed for the first time: ** ena_setup_io_intr() inits the vectors only up to io_queue_count. ** ena_request_io_irq() calls request_irq() and fails for i = 5. How to reproduce: simply run the following commands: sudo rmmod ena && sudo insmod ena.ko; sudo ethtool -L eth1 combined 3; Fix: Use ENA_MAX_MSIX_VEC(adapter->num_io_queues + adapter->xdp_num_queues) instead of adapter->msix_vecs. We need to take XDP queues into consideration as they need to have msix vectors assigned to them as well. Note that the XDP cannot be attached before the driver is up and running but in XDP mode the issue might occur when the number of queues changes right after a reset trigger. The ENA_MAX_MSIX_VEC simply adds one to the argument since the first msix vector is reserved for management queue. Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by:
Sameeh Jubran <sameehj@amazon.com> Signed-off-by:
Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arthur Kiyanovski authored
[ Upstream commit ce1f3521 ] Overview: We don't frequently change the msix vectors throughout the life cycle of the driver. We do so in two functions: ena_probe() and ena_restore(). ena_probe() is only called when the driver is loaded. ena_restore() on the other hand is called during device reset / resume operations. We use num_io_queues for calculating and allocating the number of msix vectors. At ena_probe() this value is equal to max_num_io_queues and thus this is not an issue, however ena_restore() might be called after the number of io queues has changed. A possible bug scenario is as follows: * Change number of queues from 8 to 4. (num_io_queues = 4, max_num_io_queues = 8, msix_vecs = 9,) * Trigger reset occurs -> ena_restore is called. (num_io_queues = 4, max_num_io_queues =8 , msix_vecs = 5) * Change number of queues from 4 to 6. (num_io_queues = 6, max_num_io_queues = 8, msix_vecs = 5) * The driver will reset due to failure of check_for_rx_interrupt_queue() Fix: This can be easily fixed by always using max_num_io_queues to init the msix_vecs, since this number won't change as opposed to num_io_queues. Fixes: 4d192660 ("net: ena: multiple queue creation related cleanups") Signed-off-by:
Sameeh Jubran <sameehj@amazon.com> Signed-off-by:
Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasundhara Volam authored
[ Upstream commit 5d765a5e ] If ring counts are not reset when ring reservation fails, bnxt_init_dflt_ring_mode() will not be called again to reinitialise IRQs when open() is called and results in system crash as napi will also be not initialised. This patch fixes it by resetting the ring counts. Fixes: 47558acd ("bnxt_en: Reserve rings at driver open if none was reserved at probe time.") Signed-off-by:
Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by:
Michael Chan <michael.chan@broadcom.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael Chan authored
[ Upstream commit 62bfb932 ] Other shutdown code paths will always disable PCI first to shutdown DMA before freeing context memory. Do the same sequence in the error path of probe to be safe and consistent. Fixes: c20dc142 ("bnxt_en: Disable bus master during PCI shutdown and driver unload.") Signed-off-by:
Michael Chan <michael.chan@broadcom.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael Chan authored
[ Upstream commit 0b5b561c ] The current code ignores the return value from bnxt_hwrm_func_backing_store_cfg(), causing the driver to proceed in the init path even when this vital firmware call has failed. Fix it by propagating the error code to the caller. Fixes: 1b9394e5 ("bnxt_en: Configure context memory on new devices.") Signed-off-by:
Michael Chan <michael.chan@broadcom.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Edwin Peer authored
[ Upstream commit 62d4073e ] The allocated ieee_ets structure goes out of scope without being freed, leaking memory. Appropriate result codes should be returned so that callers do not rely on invalid data passed by reference. Also cache the ETS config retrieved from the device so that it doesn't need to be freed. The balance of the code was clearly written with the intent of having the results of querying the hardware cached in the device structure. The commensurate store was evidently missed though. Fixes: 7df4ae9f ("bnxt_en: Implement DCBNL to support host-based DCBX.") Signed-off-by:
Edwin Peer <edwin.peer@broadcom.com> Signed-off-by:
Michael Chan <michael.chan@broadcom.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michael Chan authored
[ Upstream commit a24ec322 ] There is an indexing bug in determining these ethtool priority counters. Instead of using the queue ID to index, we need to normalize by modulo 10 to get the index. This index is then used to obtain the proper CoS queue counter. Rename bp->pri2cos to bp->pri2cos_idx to make this more clear. Fixes: e37fed79 ("bnxt_en: Add ethtool -S priority counters.") Signed-off-by:
Michael Chan <michael.chan@broadcom.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Doug Berger authored
[ Upstream commit 88f6c8bf ] As noted in commit 28c2d1a7 ("net: bcmgenet: enable loopback during UniMAC sw_reset") the UniMAC must be clocked at least 5 cycles while the sw_reset is asserted to ensure a clean reset. That commit enabled local loopback to provide an Rx clock from the GENET sourced Tx clk. However, when connected in MII mode the Tx clk is sourced by the PHY so if an EPHY is not supplying clocks (e.g. when the link is down) the UniMAC does not receive the necessary clocks. This commit extends the sw_reset window until the PHY reports that the link is up thereby ensuring that the clocks are being provided to the MAC to produce a clean reset. One consequence is that if the system attempts to enter a Wake on LAN suspend state when the PHY link has not been active the MAC may not have had a chance to initialize cleanly. In this case, we remove the sw_reset and enable the WoL reception path as normal with the hope that the PHY will provide the necessary clocks to drive the WoL blocks if the link becomes active after the system has entered suspend. Fixes: 1c1008c7 ("net: bcmgenet: add main driver file") Signed-off-by:
Doug Berger <opendmb@gmail.com> Acked-by:
Florian Fainelli <f.fainelli@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Doug Berger authored
[ Upstream commit 612eb1c3 ] This reverts commit 3a55402c . This is not a good solution when connecting to an external switch that may not support the isolation of the TXC signal resulting in output driver contention on the pin. A different solution is necessary. Signed-off-by:
Doug Berger <opendmb@gmail.com> Acked-by:
Florian Fainelli <f.fainelli@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Taehee Yoo authored
[ Upstream commit 384d91c2 ] gro_cells_init() returns error if memory allocation is failed. But the vxlan module doesn't check the return value of gro_cells_init(). Fixes: 58ce31cc ("vxlan: GRO support at tunnel layer")` Signed-off-by:
Taehee Yoo <ap420073@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit 6cd6cbf5 ] When application uses TCP_QUEUE_SEQ socket option to change tp->rcv_next, we must also update tp->copied_seq. Otherwise, stuff relying on tcp_inq() being precise can eventually be confused. For example, tcp_zerocopy_receive() might crash because it does not expect tcp_recv_skb() to return NULL. We could add tests in various places to fix the issue, or simply make sure tcp_inq() wont return a random value, and leave fast path as it is. Note that this fixes ioctl(fd, SIOCINQ, &val) at the same time. Fixes: ee995283 ("tcp: Initial repair mode") Fixes: 05255b82 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive") Signed-off-by:
Eric Dumazet <edumazet@google.com> Reported-by:
syzbot <syzkaller@googlegroups.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-