- Dec 08, 2023
-
-
Samuel Holland authored
[ Upstream commit fd0413bb ] Due to a typo, the code checked the RX checksum feature in the TX path. Fixes: 8a3b7a25 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by:
Samuel Holland <samuel.holland@sifive.com> Reviewed-by:
Andrew Lunn <andrew@lunn.ch> Reviewed-by:
Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://lore.kernel.org/r/20231122004219.3504219-1-samuel.holland@sifive.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Raju Rangoju authored
[ Upstream commit 7a2323ac ] xgbe_get_link_ksettings() does not propagate correct speed and duplex information to ethtool during cable unplug. Due to which ethtool reports incorrect values for speed and duplex. Address this by propagating correct information. Fixes: 7c12aa08 ("amd-xgbe: Move the PHY support into amd-xgbe") Acked-by:
Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by:
Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by:
Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Raju Rangoju authored
[ Upstream commit 7121205d ] The existing implementation uses software logic to accumulate tx completions until the specified time (1ms) is met and then poll them. However, there exists a tiny gap which leads to a race between resetting and checking the tx_activate flag. Due to this the tx completions are not reported to upper layer and tx queue timeout kicks-in restarting the device. To address this, introduce a tx cleanup mechanism as part of the periodic maintenance process. Fixes: c5aa9e3b ("amd-xgbe: Initial AMD 10GbE platform driver") Acked-by:
Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by:
Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by:
Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Raju Rangoju authored
[ Upstream commit 676ec538 ] Force the mode change for SFI in Fixed PHY configurations. Fixed PHY configurations needs PLL to be enabled while doing mode set. When the SFP module isn't connected during boot, driver assumes AN is ON and attempts auto-negotiation. However, if the connected SFP comes up in Fixed PHY configuration the link will not come up as PLL isn't enabled while the initial mode set command is issued. So, force the mode change for SFI in Fixed PHY configuration to fix link issues. Fixes: e57f7a3f ("amd-xgbe: Prepare for working with more than one type of phy") Acked-by:
Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by:
Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by:
Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Stefano Stabellini authored
[ Upstream commit 7bf9a6b4 ] xen_vcpu_info is a percpu area than needs to be mapped by Xen. Currently, it could cross a page boundary resulting in Xen being unable to map it: [ 0.567318] kernel BUG at arch/arm64/xen/../../arm/xen/enlighten.c:164! [ 0.574002] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Fix the issue by using __alloc_percpu and requesting alignment for the memory allocation. Signed-off-by:
Stefano Stabellini <stefano.stabellini@amd.com> Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2311221501340.2053963@ubuntu-linux-20-04-desktop Fixes: 24d5373d ("arm/xen: Use alloc_percpu rather than __alloc_percpu") Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jose Ignacio Tornos Martinez authored
[ Upstream commit 0739af07 ] Using generic ASIX Electronics Corp. AX88179 Gigabit Ethernet device, the following test cycle has been implemented: - power on - check logs - shutdown - after detecting the system shutdown, disconnect power - after approximately 60 seconds of sleep, power is restored Running some cycles, sometimes error logs like this appear: kernel: ax88179_178a 2-9:1.0 (unnamed net_device) (uninitialized): Failed to write reg index 0x0001: -19 kernel: ax88179_178a 2-9:1.0 (unnamed net_device) (uninitialized): Failed to read reg index 0x0001: -19 ... These failed operation are happening during ax88179_reset execution, so the initialization could not be correct. In order to avoid this, we need to increase the delay after reset and clock initial operations. By using these larger values, many cycles have been run and no failed operations appear. It would be better to check some status register to verify when the operation has finished, but I do not have found any available information (neither in the public datasheets nor in the manufacturer's driver). The only available information for the necessary delays is the maufacturer's driver (original values) but the proposed values are not enough for the tested devices. Fixes: e2ca90c2 ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver") Reported-by:
Herb Wei <weihao.bj@ieisystem.com> Tested-by:
Herb Wei <weihao.bj@ieisystem.com> Signed-off-by:
Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Link: https://lore.kernel.org/r/20231120120642.54334-1-jtornosm@redhat.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Kunwu Chan authored
[ Upstream commit c0e29262 ] net/ipv4/route.c:783:46: warning: incorrect type in argument 2 (different base types) net/ipv4/route.c:783:46: expected unsigned int [usertype] key net/ipv4/route.c:783:46: got restricted __be32 [usertype] new_gw Fixes: 969447f2 ("ipv4: use new_gw for redirect neigh lookup") Suggested-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
Kunwu Chan <chentao@kylinos.cn> Link: https://lore.kernel.org/r/20231119141759.420477-1-chentao@kylinos.cn Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Charles Yi authored
[ Upstream commit fc43e9c8 ] hid_debug_events_release releases resources bound to the HID device instance. hid_device_release releases the underlying HID device instance potentially before hid_debug_events_release has completed releasing debug resources bound to the same HID device instance. Reference count to prevent the HID device instance from being torn down preemptively when HID debugging support is used. When count reaches zero, release core resources of HID device instance using hiddev_free. The crash: [ 120.728477][ T4396] kernel BUG at lib/list_debug.c:53! [ 120.728505][ T4396] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 120.739806][ T4396] Modules linked in: bcmdhd dhd_static_buf 8822cu pcie_mhi r8168 [ 120.747386][ T4396] CPU: 1 PID: 4396 Comm: hidt_bridge Not tainted 5.10.110 #257 [ 120.754771][ T4396] Hardware name: Rockchip RK3588 EVB4 LP4 V10 Board (DT) [ 120.761643][ T4396] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--) [ 120.768338][ T4396] pc : __list_del_entry_valid+0x98/0xac [ 120.773730][ T4396] lr : __list_del_entry_valid+0x98/0xac [ 120.779120][ T4396] sp : ffffffc01e62bb60 [ 120.783126][ T4396] x29: ffffffc01e62bb60 x28: ffffff818ce3a200 [ 120.789126][ T4396] x27: 0000000000000009 x26: 0000000000980000 [ 120.795126][ T4396] x25: ffffffc012431000 x24: ffffff802c6d4e00 [ 120.801125][ T4396] x23: ffffff8005c66f00 x22: ffffffc01183b5b8 [ 120.807125][ T4396] x21: ffffff819df2f100 x20: 0000000000000000 [ 120.813124][ T4396] x19: ffffff802c3f0700 x18: ffffffc01d2cd058 [ 120.819124][ T4396] x17: 0000000000000000 x16: 0000000000000000 [ 120.825124][ T4396] x15: 0000000000000004 x14: 0000000000003fff [ 120.831123][ T4396] x13: ffffffc012085588 x12: 0000000000000003 [ 120.837123][ T4396] x11: 00000000ffffbfff x10: 0000000000000003 [ 120.843123][ T4396] x9 : 455103d46b329300 x8 : 455103d46b329300 [ 120.849124][ T4396] x7 : 74707572726f6320 x6 : ffffffc0124b8cb5 [ 120.855124][ T4396] x5 : ffffffffffffffff x4 : 0000000000000000 [ 120.861123][ T4396] x3 : ffffffc011cf4f90 x2 : ffffff81fee7b948 [ 120.867122][ T4396] x1 : ffffffc011cf4f90 x0 : 0000000000000054 [ 120.873122][ T4396] Call trace: [ 120.876259][ T4396] __list_del_entry_valid+0x98/0xac [ 120.881304][ T4396] hid_debug_events_release+0x48/0x12c [ 120.886617][ T4396] full_proxy_release+0x50/0xbc [ 120.891323][ T4396] __fput+0xdc/0x238 [ 120.895075][ T4396] ____fput+0x14/0x24 [ 120.898911][ T4396] task_work_run+0x90/0x148 [ 120.903268][ T4396] do_exit+0x1bc/0x8a4 [ 120.907193][ T4396] do_group_exit+0x8c/0xa4 [ 120.911458][ T4396] get_signal+0x468/0x744 [ 120.915643][ T4396] do_signal+0x84/0x280 [ 120.919650][ T4396] do_notify_resume+0xd0/0x218 [ 120.924262][ T4396] work_pending+0xc/0x3f0 [ Rahul Rameshbabu <sergeantsagara@protonmail.com>: rework changelog ] Fixes: cd667ce2 ("HID: use debugfs for events/reports dumping") Signed-off-by:
Charles Yi <be286@163.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Benjamin Tissoires authored
[ Upstream commit 1e839143 ] This unique identifier is currently used only for ensuring uniqueness in sysfs. However, this could be handful for userspace to refer to a specific hid_device by this id. 2 use cases are in my mind: LEDs (and their naming convention), and HID-BPF. Reviewed-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220902132938.2409206-9-benjamin.tissoires@redhat.com Stable-dep-of: fc43e9c8 ("HID: fix HID device resource race between HID core and debugging support") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jonas Karlman authored
[ Upstream commit bb0a05ac ] Use of DRM_FORMAT_RGB888 and DRM_FORMAT_BGR888 on e.g. RK3288, RK3328 and RK3399 result in wrong colors being displayed. The issue can be observed using modetest: modetest -s <connector_id>@<crtc_id>:1920x1080-60@RG24 modetest -s <connector_id>@<crtc_id>:1920x1080-60@BG24 Vendor 4.4 kernel apply an inverted rb swap for these formats on VOP full framework (IP version 3.x) compared to VOP little framework (2.x). Fix colors by applying different rb swap for VOP full framework (3.x) and VOP little framework (2.x) similar to vendor 4.4 kernel. Fixes: 85a359f2 ("drm/rockchip: Add BGR formats to VOP") Signed-off-by:
Jonas Karlman <jonas@kwiboo.se> Tested-by:
Diederik de Haas <didi.debian@cknow.org> Reviewed-by:
Christopher Obbard <chris.obbard@collabora.com> Tested-by:
Christopher Obbard <chris.obbard@collabora.com> Signed-off-by:
Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231026191500.2994225-1-jonas@kwiboo.se Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
倪琛 authored
[ Upstream commit a6925165 ] Add missing error return check for devm_ioport_map() and return the error if this function call fails. Fixes: 0d5ff566 ("libata: convert to iomap") Signed-off-by:
Chen Ni <nichen@iscas.ac.cn> Reviewed-by:
Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by:
Damien Le Moal <dlemoal@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Marek Vasut authored
[ Upstream commit 3f9a91b6 ] The Innolux G101ICE-L01 datasheet [1] page 17 table 6.1 INPUT SIGNAL TIMING SPECIFICATIONS indicates that maximum vertical blanking time is 40 lines. Currently the driver uses 29 lines. Fix it, and since this panel is a DE panel, adjust the timings to make them less hostile to controllers which cannot do 1 px HSA/VSA, distribute the delays evenly between all three parts. [1] https://www.data-modul.com/sites/default/files/products/G101ICE-L01-C2-specification-12042389.pdf Fixes: 1e29b840 ("drm/panel: simple: Add Innolux G101ICE-L01 panel") Signed-off-by:
Marek Vasut <marex@denx.de> Reviewed-by:
Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231008223256.279196-1-marex@denx.de Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Marek Vasut authored
[ Upstream commit 06fc41b0 ] Add missing .bus_flags = DRM_BUS_FLAG_DE_HIGH to this panel description, ones which match both the datasheet and the panel display_timing flags . Fixes: 1e29b840 ("drm/panel: simple: Add Innolux G101ICE-L01 panel") Signed-off-by:
Marek Vasut <marex@denx.de> Reviewed-by:
Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231008223315.279215-1-marex@denx.de Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
David Howells authored
[ Upstream commit 2a4ca1b4 ] When kafs tries to look up a cell in the DNS or the local config, it will translate a lookup failure into EDESTADDRREQ whereas OpenAFS translates it into ENOENT. Applications such as West expect the latter behaviour and fail if they see the former. This can be seen by trying to mount an unknown cell: # mount -t afs %example.com:cell.root /mnt mount: /mnt: mount(2) system call failed: Destination address required. Fixes: 4d673da1 ("afs: Support the AFS dynamic root") Reported-by:
Markus Suvanto <markus.suvanto@gmail.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216637 Signed-off-by:
David Howells <dhowells@redhat.com> Reviewed-by:
Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Nathan Chancellor authored
This commit has no upstream equivalent. After commit 012dba0a ("PCI: keystone: Don't discard .probe() callback") in 5.4.262, there are two modpost warnings when building with clang: WARNING: modpost: vmlinux.o(.text+0x5aa6dc): Section mismatch in reference from the function ks_pcie_probe() to the function .init.text:ks_pcie_add_pcie_port() The function ks_pcie_probe() references the function __init ks_pcie_add_pcie_port(). This is often because ks_pcie_probe lacks a __init annotation or the annotation of ks_pcie_add_pcie_port is wrong. WARNING: modpost: vmlinux.o(.text+0x5aa6f4): Section mismatch in reference from the function ks_pcie_probe() to the function .init.text:ks_pcie_add_pcie_ep() The function ks_pcie_probe() references the function __init ks_pcie_add_pcie_ep(). This is often because ks_pcie_probe lacks a __init annotation or the annotation of ks_pcie_add_pcie_ep is wrong. ks_pcie_add_pcie_ep() was removed in upstream commit a0fd361d ("PCI: dwc: Move "dbi", "dbi2", and "addr_space" resource setup into common code") and ks_pcie_add_pcie_port() was removed in upstream commit 60f5b73f ("PCI: dwc: Remove unnecessary wrappers around dw_pcie_host_init()"), both of which happened before upstream commit 7994db90 ("PCI: keystone: Don't discard .probe() callback"). As neither of these removal changes are really suitable for stable, just remove __init from these functions in stable, as it is no longer a correct annotation after dropping __init from ks_pcie_probe(). Fixes: 012dba0a ("PCI: keystone: Don't discard .probe() callback") Reported-by:
Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by:
Nathan Chancellor <nathan@kernel.org> Reviewed-by:
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christopher Bednarz authored
commit bb6d73d9 upstream. Currently irdma allows zero-length STAGs to be programmed in HW during the kernel mode fast register flow. Zero-length MR or STAG registration disable HW memory length checks. Improve gaps in bounds checking in irdma by preventing zero-length STAG or MR registrations except if the IB_PD_UNSAFE_GLOBAL_RKEY is set. This addresses the disclosure CVE-2023-25775. Fixes: b48c24c2 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by:
Christopher Bednarz <christopher.n.bednarz@intel.com> Signed-off-by:
Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230818144838.1758-1-shiraz.saleem@intel.com Signed-off-by:
Leon Romanovsky <leon@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Saravana Kannan authored
commit 2e84dc37 upstream. This commit fixes a bug in commit 9ed98953 ("driver core: Functional dependencies tracking support") where the device link status was incorrectly updated in the driver unbind path before all the device's resources were released. Fixes: 9ed98953 ("driver core: Functional dependencies tracking support") Cc: stable <stable@kernel.org> Reported-by:
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/ Signed-off-by:
Saravana Kannan <saravanak@google.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Mark Brown <broonie@kernel.org> Cc: Matti Vaittinen <mazziesaccount@gmail.com> Cc: James Clark <james.clark@arm.com> Acked-by:
"Rafael J. Wysocki" <rafael@kernel.org> Tested-by:
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by:
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com Signed-off-by:
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Nov 29, 2023
-
-
Greg Kroah-Hartman authored
Link: https://lore.kernel.org/r/20231124171941.909624388@linuxfoundation.org Link: https://lore.kernel.org/r/20231125163112.419066112@linuxfoundation.org Link: https://lore.kernel.org/r/20231125194331.369464812@linuxfoundation.org Tested-by:
Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20231126154329.848261327@linuxfoundation.org Tested-by:
Florian Fainelli <florian.fainelli@broadcom.com> Tested-by:
Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Tested-by:
Jon Hunter <jonathanh@nvidia.com> Tested-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
3f0465a9 ("netfilter: nf_tables: dynamically allocate hooks per net_device in flowtables") reworks flowtable support to allow for dynamic allocation of hooks, which implicitly fixes the following bogus EBUSY in transaction: delete flowtable add flowtable # same flowtable with same devices, it hits EBUSY This patch does not exist in any tree, but it fixes this issue for -stable Linux kernel 5.4 Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit c9bd2651 upstream. nft -f -<<EOF add table ip t add table ip t { flags dormant; } add chain ip t c { type filter hook input priority 0; } add table ip t EOF Triggers a splat from nf core on next table delete because we lose track of right hook register state: WARNING: CPU: 2 PID: 1597 at net/netfilter/core.c:501 __nf_unregister_net_hook RIP: 0010:__nf_unregister_net_hook+0x41b/0x570 nf_unregister_net_hook+0xb4/0xf0 __nf_tables_unregister_hook+0x160/0x1d0 [..] The above should have table in *active* state, but in fact no hooks were registered. Reject on/off/on games rather than attempting to fix this. Fixes: 179d9ba5 ("netfilter: nf_tables: fix table flag updates") Reported-by:
"Lee, Cherie-Anne" <cherie.lee@starlabs.sg> Cc: Bing-Jhong Billy Jheng <billy@starlabs.sg> Cc: info@starlabs.sg Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 179d9ba5 upstream. The dormant flag need to be updated from the preparation phase, otherwise, two consecutive requests to dorm a table in the same batch might try to remove the same hooks twice, resulting in the following warning: hook not found, pf 3 num 0 WARNING: CPU: 0 PID: 334 at net/netfilter/core.c:480 __nf_unregister_net_hook+0x1eb/0x610 net/netfilter/core.c:480 Modules linked in: CPU: 0 PID: 334 Comm: kworker/u4:5 Not tainted 5.12.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net RIP: 0010:__nf_unregister_net_hook+0x1eb/0x610 net/netfilter/core.c:480 This patch is a partial revert of 0ce7cf41 ("netfilter: nftables: update table flags from the commit phase") to restore the previous behaviour. However, there is still another problem: A batch containing a series of dorm-wakeup-dorm table and vice-versa also trigger the ...
-
Pablo Neira Ayuso authored
commit 0ce7cf41 upstream. Do not update table flags from the preparation phase. Store the flags update into the transaction, then update the flags from the commit phase. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit f9a43007 upstream. __nft_release_hooks() is called from pre_netns exit path which unregisters the hooks, then the NETDEV_UNREGISTER event is triggered which unregisters the hooks again. [ 565.221461] WARNING: CPU: 18 PID: 193 at net/netfilter/core.c:495 __nf_unregister_net_hook+0x247/0x270 [...] [ 565.246890] CPU: 18 PID: 193 Comm: kworker/u64:1 Tainted: G E 5.18.0-rc7+ #27 [ 565.253682] Workqueue: netns cleanup_net [ 565.257059] RIP: 0010:__nf_unregister_net_hook+0x247/0x270 [...] [ 565.297120] Call Trace: [ 565.300900] <TASK> [ 565.304683] nf_tables_flowtable_event+0x16a/0x220 [nf_tables] [ 565.308518] raw_notifier_call_chain+0x63/0x80 [ 565.312386] unregister_netdevice_many+0x54f/0xb50 Unregister and destroy netdev hook from netns pre_exit via kfree_rcu so the NETDEV_UNREGISTER path see unregistered hooks. Fixes: 767d1216 ("netfilter: nftables: fix possible UAF over chains from packet path in netns") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 6069da44 upstream. Unregister flowtable hooks before they are releases via nf_tables_flowtable_destroy() otherwise hook core reports UAF. BUG: KASAN: use-after-free in nf_hook_entries_grow+0x5a7/0x700 net/netfilter/core.c:142 net/netfilter/core.c:142 Read of size 4 at addr ffff8880736f7438 by task syz-executor579/3666 CPU: 0 PID: 3666 Comm: syz-executor579 Not tainted 5.16.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] __dump_stack lib/dump_stack.c:88 [inline] lib/dump_stack.c:106 dump_stack_lvl+0x1dc/0x2d8 lib/dump_stack.c:106 lib/dump_stack.c:106 print_address_description+0x65/0x380 mm/kasan/report.c:247 mm/kasan/report.c:247 __kasan_report mm/kasan/report.c:433 [inline] __kasan_report mm/kasan/report.c:433 [inline] mm/kasan/report.c:450 kasan_report+0x19a/0x1f0 mm/kasan/report.c:450 mm/kasan/report.c:450 nf_hook_entries_grow+0x5a7/0x700 net/netfilter/core.c:142 net/netfilter/core.c:142 __nf_register_net_hook+0x27e/0x8d0 net/netfilter/core.c:429 net/netfilter/core.c:429 nf_register_net_hook+0xaa/0x180 net/netfilter/core.c:571 net/netfilter/core.c:571 nft_register_flowtable_net_hooks+0x3c5/0x730 net/netfilter/nf_tables_api.c:7232 net/netfilter/nf_tables_api.c:7232 nf_tables_newflowtable+0x2022/0x2cf0 net/netfilter/nf_tables_api.c:7430 net/netfilter/nf_tables_api.c:7430 nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline] nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline] nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline] net/netfilter/nfnetlink.c:652 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline] net/netfilter/nfnetlink.c:652 nfnetlink_rcv+0x10e6/0x2550 net/netfilter/nfnetlink.c:652 net/netfilter/nfnetlink.c:652 __nft_release_hook() calls nft_unregister_flowtable_net_hooks() which only unregisters the hooks, then after RCU grace period, it is guaranteed that no packets add new entries to the flowtable (no flow offload rules and flowtable hooks are reachable from packet path), so it is safe to call nf_flow_table_free() which cleans up the remaining entries from the flowtable (both software and hardware) and it unbinds the flow_block. Fixes: ff4bf2f4 ("netfilter: nf_tables: add nft_unregister_flowtable_hook()") Reported-by:
<syzbot+e918523f77e62790d6d9@syzkaller.appspotmail.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit cf5000a7 upstream. When more than 255 elements expired we're supposed to switch to a new gc container structure. This never happens: u8 type will wrap before reaching the boundary and nft_trans_gc_space() always returns true. This means we recycle the initial gc container structure and lose track of the elements that came before. While at it, don't deref 'gc' after we've passed it to call_rcu. Fixes: 5f68718b ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Reported-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit b079155f upstream. Skip GC run if iterator rewinds to the beginning with EAGAIN, otherwise GC might collect the same element more than once. Fixes: f6c383b8 ("netfilter: nf_tables: adapt set backend to use GC transaction API") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 96b33300 upstream. rbtree GC does not modify the datastructure, instead it collects expired elements and it enqueues a GC transaction. Use a read spinlock instead to avoid data contention while GC worker is running. Fixes: f6c383b8 ("netfilter: nf_tables: adapt set backend to use GC transaction API") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 2ee52ae9 upstream. New elements in this transaction might expired before such transaction ends. Skip sync GC for such elements otherwise commit path might walk over an already released object. Once transaction is finished, async GC will collect such expired element. Fixes: f6c383b8 ("netfilter: nf_tables: adapt set backend to use GC transaction API") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit 8e51830e upstream. Don't queue more gc work, else we may queue the same elements multiple times. If an element is flagged as dead, this can mean that either the previous gc request was invalidated/discarded by a transaction or that the previous request is still pending in the system work queue. The latter will happen if the gc interval is set to a very low value, e.g. 1ms, and system work queue is backlogged. The sets refcount is 1 if no previous gc requeusts are queued, so add a helper for this and skip gc run if old requests are pending. Add a helper for this and skip the gc run in this case. Fixes: f6c383b8 ("netfilter: nf_tables: adapt set backend to use GC transaction API") Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 8357bc94 upstream. Use nf_tables_gc_list_lock spinlock, not nf_tables_destroy_list_lock to protect the gc_list. Fixes: 5f68718b ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 72034434 upstream. Abort path is missing a synchronization point with GC transactions. Add GC sequence number hence any GC transaction losing race will be discarded. Fixes: 5f68718b ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 02c6c244 upstream. Use maybe_get_net() since GC workqueue might race with netns exit path. Fixes: 5f68718b ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 6a33d8b7 upstream. Netlink event path is missing a synchronization point with GC transactions. Add GC sequence number update to netns release path and netlink event path, any GC transaction losing race will be discarded. Fixes: 5f68718b ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit a2dd0233 upstream. Ditch it, it has been replace it by the GC transaction API and it has no clients anymore. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit c92db303 upstream. Set on the NFT_SET_ELEM_DEAD_BIT flag on this element, instead of performing element removal which might race with an ongoing transaction. Enable gc when dynamic flag is set on since dynset deletion requires garbage collection after this patch. Fixes: d0a8d877 ("netfilter: nft_dynset: support for element deletion") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit f6c383b8 upstream. Use the GC transaction API to replace the old and buggy gc API and the busy mark approach. No set elements are removed from async garbage collection anymore, instead the _DEAD bit is set on so the set element is not visible from lookup path anymore. Async GC enqueues transaction work that might be aborted and retried later. rbtree and pipapo set backends does not set on the _DEAD bit from the sync GC path since this runs in control plane path where mutex is held. In this case, set elements are deactivated, removed and then released via RCU callback, sync GC never fails. Fixes: 3c4287f6 ("nf_tables: Add set type for arbitrary concatenation of ranges") Fixes: 8d8540c4 ("netfilter: nft_set_rbtree: add timeout support") Fixes: 9d098292 ("netfilter: nft_hash: add support for timeouts") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pablo Neira Ayuso authored
commit 5f68718b upstream. The set types rhashtable and rbtree use a GC worker to reclaim memory. >From system work queue, in periodic intervals, a scan of the table is done. The major caveat here is that the nft transaction mutex is not held. This causes a race between control plane and GC when they attempt to delete the same element. We cannot grab the netlink mutex from the work queue, because the control plane has to wait for the GC work queue in case the set is to be removed, so we get following deadlock: cpu 1 cpu2 GC work transaction comes in , lock nft mutex `acquire nft mutex // BLOCKS transaction asks to remove the set set destruction calls cancel_work_sync() cancel_work_sync will now block forever, because it is waiting for the mutex the caller already owns. This patch adds a new API that deals with garbage collection in two steps: 1) Lockless GC of expired elements sets on the NFT_SET_ELEM_DEAD_BIT so they are not visible via lookup. Annotate current GC sequence in the GC transaction. Enqueue GC transaction work as soon as it is full. If ruleset is updated, then GC transaction is aborted and retried later. 2) GC work grabs the mutex. If GC sequence has changed then this GC transaction lost race with control plane, abort it as it contains stale references to objects and let GC try again later. If the ruleset is intact, then this GC transaction deactivates and removes the elements and it uses call_rcu() to destroy elements. Note that no elements are removed from GC lockless path, the _DEAD bit is set and pointers are collected. GC catchall does not remove the elements anymore too. There is a new set->dead flag that is set on to abort the GC transaction to deal with set->ops->destroy() path which removes the remaining elements in the set from commit_release, where no mutex is held. To deal with GC when mutex is held, which allows safe deactivate and removal, add sync GC API which releases the set element object via call_rcu(). This is used by rbtree and pipapo backends which also perform garbage collection from control plane path. Since element removal from sets can happen from control plane and element garbage collection/timeout, it is necessary to keep the set structure alive until all elements have been deactivated and destroyed. We cannot do a cancel_work_sync or flush_work in nft_set_destroy because its called with the transaction mutex held, but the aforementioned async work queue might be blocked on the very mutex that nft_set_destroy() callchain is sitting on. This gives us the choice of ABBA deadlock or UaF. To avoid both, add set->refs refcount_t member. The GC API can then increment the set refcount and release it once the elements have been free'd. Set backends are adapted to use the GC transaction API in a follow up patch entitled: ("netfilter: nf_tables: use gc transaction API in set backends") This is joint work with Florian Westphal. Fixes: cfed7e1b ("netfilter: nf_tables: add set garbage collection helpers") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit 24138933 upstream. There is an asymmetry between commit/abort and preparation phase if the following conditions are met: 1. set is a verdict map ("1.2.3.4 : jump foo") 2. timeouts are enabled In this case, following sequence is problematic: 1. element E in set S refers to chain C 2. userspace requests removal of set S 3. kernel does a set walk to decrement chain->use count for all elements from preparation phase 4. kernel does another set walk to remove elements from the commit phase (or another walk to do a chain->use increment for all elements from abort phase) If E has already expired in 1), it will be ignored during list walk, so its use count won't have been changed. Then, when set is culled, ->destroy callback will zap the element via nf_tables_set_elem_destroy(), but this function is only safe for elements that have been deactivated earlier from the preparation phase: lack of earlier deactivate removes the element but leaks the chain use count, which results in a WARN splat when the chain gets removed later, plus a leak of the nft_chain structure. Update pipapo_get() not to skip expired elements, otherwise flush command reports bogus ENOENT errors. Fixes: 3c4287f6 ("nf_tables: Add set type for arbitrary concatenation of ranges") Fixes: 8d8540c4 ("netfilter: nft_set_rbtree: add timeout support") Fixes: 9d098292 ("netfilter: nft_hash: add support for timeouts") Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit f718863a upstream. The lazy gc on insert that should remove timed-out entries fails to release the other half of the interval, if any. Can be reproduced with tests/shell/testcases/sets/0044interval_overlap_0 in nftables.git and kmemleak enabled kernel. Second bug is the use of rbe_prev vs. prev pointer. If rbe_prev() returns NULL after at least one iteration, rbe_prev points to element that is not an end interval, hence it should not be removed. Lastly, check the genmask of the end interval if this is active in the current generation. Fixes: c9e6978e ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection") Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit 61ae320a upstream. There is no guarantee that rb_prev() will not return NULL in nft_rbtree_gc_elem(): general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] nft_add_set_elem+0x14b0/0x2990 nf_tables_newsetelem+0x528/0xb30 Furthermore, there is a possible use-after-free while iterating, 'node' can be free'd so we need to cache the next value to use. Fixes: c9e6978e ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection") Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-