- Sep 02, 2023
-
-
Helge Deller authored
Vidra Jonas reported issues on parisc with libuv which then triggers build errors with cmake. Debugging shows that those issues stem from io_uring(). I was not able to easily pull in upstream commits directly, so here is IMHO the least invasive manual backport of the following upstream commits to fix the cache aliasing issues on parisc on kernel 6.1 with io_uring: 56675f8b ("io_uring/parisc: Adjust pgoff in io_uring mmap() for parisc") 32832a40 ("io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()") d808459b ("io_uring: Adjust mapping wrt architecture aliasing requirements") With this patch kernel 6.1 has all relevant mmap changes and is identical to kernel 6.5 with regard to mmap() in io_uring. Signed-off-by:
Helge Deller <deller@gmx.de> Reported-by:
<Vidra.Jonas@seznam.cz> Link: https://lore.kernel.org/linux-parisc/520.NvTX.6mXZpmfh4Ju.1awpAS@seznam.cz/ Cc: Sam James <sam@gentoo.org> Cc: John David Anglin <dave.anglin@bell.net> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit b5d89408 upstream. Signed-off-by:
Helge Deller <deller@gmx.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John David Anglin authored
commit 567b3515 upstream. This change simplifies the randomization of file mapping regions. It reworks the code to remove duplication. The flow is now similar to that for mips. Finally, we consistently use the do_color_align variable to determine when color alignment is needed. Tested on rp3440. Signed-off-by:
John David Anglin <dave.anglin@bell.net> Signed-off-by:
Helge Deller <deller@gmx.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Helge Deller authored
commit 0a6b58c5 upstream. On the parisc architecture, lockdep reports for all static objects which are in the __initdata section (e.g. "setup_done" in devtmpfs, "kthreadd_done" in init/main.c) this warning: INFO: trying to register non-static key. The warning itself is wrong, because those objects are in the __initdata section, but the section itself is on parisc outside of range from _stext to _end, which is why the static_obj() functions returns a wrong answer. While fixing this issue, I noticed that the whole existing check can be simplified a lot. Instead of checking against the _stext and _end symbols (which include code areas too) just check for the .data and .bss segments (since we check a data object). This can be done with the existing is_kernel_core_data() macro. In addition objects in the __initdata section can be checked with init_section_contains(), and is_kernel_rodata() allows keys to be in the _ro_after_init section. This partly reverts and simplifies commit bac59d18 ("x86/setup: Fix static memory detection"). Link: https://lkml.kernel.org/r/ZNqrLRaOi/3wPAdp@p100 Fixes: bac59d18 ("x86/setup: Fix static memory detection") Signed-off-by:
Helge Deller <deller@gmx.de> Cc: Borislav Petkov <bp@suse.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
commit a6846234 upstream. Today module_frob_arch_sections() spots init sections from their 'init' prefix, and uses this to keep the init PLTs separate from the rest. get_module_plt() uses within_module_init() to determine if a location is in the init text or not, but this depends on whether core code thought this was an init section. Naturally the logic is different. module_init_layout_section() groups the init and exit text together if module unloading is disabled, as the exit code will never run. The result is kernels with this configuration can't load all their modules because there are not enough PLTs for the combined init+exit section. A previous patch exposed module_init_layout_section(), use that so the logic is the same. Fixes: 055f23b7 ("module: check for exit sections in layout_sections() instead of module_init_section()") Cc: stable@vger.kernel.org Signed-off-by:
James Morse <james.morse@arm.com> Signed-off-by:
Luis Chamberlain <mcgrof@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
commit f928f8b1 upstream. Today module_frob_arch_sections() spots init sections from their 'init' prefix, and uses this to keep the init PLTs separate from the rest. module_emit_plt_entry() uses within_module_init() to determine if a location is in the init text or not, but this depends on whether core code thought this was an init section. Naturally the logic is different. module_init_layout_section() groups the init and exit text together if module unloading is disabled, as the exit code will never run. The result is kernels with this configuration can't load all their modules because there are not enough PLTs for the combined init+exit section. This results in the following: | WARNING: CPU: 2 PID: 51 at arch/arm64/kernel/module-plts.c:99 module_emit_plt_entry+0x184/0x1cc | Modules linked in: crct10dif_common | CPU: 2 PID: 51 Comm: modprobe Not tainted 6.5.0-rc4-yocto-standard-dirty #15208 | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 | pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : module_emit_plt_entry+0x184/0x1cc | lr : module_emit_plt_entry+0x94/0x1cc | sp : ffffffc0803bba60 [...] | Call trace: | module_emit_plt_entry+0x184/0x1cc | apply_relocate_add+0x2bc/0x8e4 | load_module+0xe34/0x1bd4 | init_module_from_file+0x84/0xc0 | __arm64_sys_finit_module+0x1b8/0x27c | invoke_syscall.constprop.0+0x5c/0x104 | do_el0_svc+0x58/0x160 | el0_svc+0x38/0x110 | el0t_64_sync_handler+0xc0/0xc4 | el0t_64_sync+0x190/0x194 A previous patch exposed module_init_layout_section(), use that so the logic is the same. Reported-by:
Adam Johnston <adam.johnston@arm.com> Tested-by:
Adam Johnston <adam.johnston@arm.com> Fixes: 055f23b7 ("module: check for exit sections in layout_sections() instead of module_init_section()") Cc: <stable@vger.kernel.org> # 5.15.x: 60a0aab7 arm64: module-plts: inline linux/moduleloader.h Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by:
James Morse <james.morse@arm.com> Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Luis Chamberlain <mcgrof@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
commit 60a0aab7 upstream. module_frob_arch_sections() is declared in moduleloader.h, but that is not included before the definition: arch/arm64/kernel/module-plts.c:286:5: error: no previous prototype for 'module_frob_arch_sections' [-Werror=missing-prototypes] Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Reviewed-by:
Kees Cook <keescook@chromium.org> Acked-by:
Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230516160642.523862-11-arnd@kernel.org Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
commit 2abcc4b5 upstream. module_init_layout_section() choses whether the core module loader considers a section as init or not. This affects the placement of the exit section when module unloading is disabled. This code will never run, so it can be free()d once the module has been initialised. arm and arm64 need to count the number of PLTs they need before applying relocations based on the section name. The init PLTs are stored separately so they can be free()d. arm and arm64 both use within_module_init() to decide which list of PLTs to use when applying the relocation. Because within_module_init()'s behaviour changes when module unloading is disabled, both architecture would need to take this into account when counting the PLTs. Today neither architecture does this, meaning when module unloading is disabled there are insufficient PLTs in the init section to load some modules, resulting in warnings: | WARNING: CPU: 2 PID: 51 at arch/arm64/kernel/module-plts.c:99 module_emit_plt_entry+0x184/0x1cc | Modules linked in: crct10dif_common | CPU: 2 PID: 51 Comm: modprobe Not tainted 6.5.0-rc4-yocto-standard-dirty #15208 | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 | pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : module_emit_plt_entry+0x184/0x1cc | lr : module_emit_plt_entry+0x94/0x1cc | sp : ffffffc0803bba60 [...] | Call trace: | module_emit_plt_entry+0x184/0x1cc | apply_relocate_add+0x2bc/0x8e4 | load_module+0xe34/0x1bd4 | init_module_from_file+0x84/0xc0 | __arm64_sys_finit_module+0x1b8/0x27c | invoke_syscall.constprop.0+0x5c/0x104 | do_el0_svc+0x58/0x160 | el0_svc+0x38/0x110 | el0t_64_sync_handler+0xc0/0xc4 | el0t_64_sync+0x190/0x194 Instead of duplicating module_init_layout_section()s logic, expose it. Reported-by:
Adam Johnston <adam.johnston@arm.com> Fixes: 055f23b7 ("module: check for exit sections in layout_sections() instead of module_init_section()") Cc: stable@vger.kernel.org Signed-off-by:
James Morse <james.morse@arm.com> Signed-off-by:
Luis Chamberlain <mcgrof@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mario Limonciello authored
commit 5f641174 upstream. The `nocrt` module parameter has no code associated with it and does nothing. As `crt=-1` has same functionality as what nocrt should be doing drop `nocrt` and associated documentation. This should fix a quirk for Gigabyte GA-7ZX that used `nocrt` and thus didn't function properly. Fixes: 8c99fdce ("ACPI: thermal: set "thermal.nocrt" via DMI on Gigabyte GA-7ZX") Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Aug 30, 2023
-
-
Greg Kroah-Hartman authored
Link: https://lore.kernel.org/r/20230828101156.480754469@linuxfoundation.org Tested-by:
Conor Dooley <conor.dooley@microchip.com> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Tested-by:
Salvatore Bonaccorso <carnil@debian.org> Tested-by:
Joel Fernandes (Google) <joel@joelfernandes.org> Tested-by:
SeongJae Park <sj@kernel.org> Tested-by:
Bagas Sanjaya <bagasdotme@gmail.com> Tested-by:
Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by:
Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com> Tested-by:
Shuah Khan <skhan@linuxfoundation.org> Tested-by:
Florian Fainelli <florian.fainelli@broadcom.com> Tested-by:
Guenter Roeck <linux@roeck-us.net> Tested-by:
Jon Hunter <jonathanh@nvidia.com> Tested-by:
Pavel Machek (CIP) <pavel@denx.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
commit fd0a7ec3 upstream. The vangogh driver just gained a link time dependency that now causes randconfig builds to fail: x86_64-linux-ld: sound/soc/amd/vangogh/pci-acp5x.o: in function `snd_acp5x_probe': pci-acp5x.c:(.text+0xbb): undefined reference to `snd_amd_acp_find_config' Fixes: e89f45ed ("ASoC: amd: vangogh: Add check for acp config flags in vangogh platform") Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230605085839.2157268-1-arnd@kernel.org Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Liam R. Howlett authored
[ Upstream commit cfeb6ae8 ] The current implementation of append may cause duplicate data and/or incorrect ranges to be returned to a reader during an update. Although this has not been reported or seen, disable the append write operation while the tree is in rcu mode out of an abundance of caution. During the analysis of the mas_next_slot() the following was artificially created by separating the writer and reader code: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last slot write is to start of slot store current contents in slot overwrite old end pivot mas_next_slot(): read end metadata read old end pivot return with incorrect range store new value Alternatively: Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last lost write to end of slot store value mas_next_slot(): read end metadata read old end pivot read new end pivot return with incorrect range set old end pivot There may be other accesses that are not safe since we are now updating both metadata and pointers, so disabling append if there could be rcu readers is the safest action. Link: https://lkml.kernel.org/r/20230819004356.1454718-2-Liam.Howlett@oracle.com Fixes: 54a611b6 ("Maple Tree: add new data structure") Signed-off-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Mario Limonciello authored
[ Upstream commit c008323f ] Lenovo 82SJ doesn't have DMIC connected like 82V2 does. Narrow the match down to only cover 82V2. Reported-by:
<prosenfeld@Yuhsbstudents.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217063 Fixes: 2232b2dd ("ASoC: amd: yc: Add Lenovo Yoga Slim 7 Pro X to quirks table") Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230824011149.1395-1-mario.limonciello@amd.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Bartosz Golaszewski authored
[ Upstream commit 6e39c1ac ] Associate the swnode of the GPIO device's (which is the interrupt controller here) with the irq domain. Otherwise the interrupt-controller device attribute is a no-op. Fixes: cb8c474e ("gpio: sim: new testing module") Signed-off-by:
Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by:
Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Bartosz Golaszewski authored
[ Upstream commit ab4109f9 ] If a GPIO simulator device is unbound with interrupts still requested, we will hit a use-after-free issue in __irq_domain_deactivate_irq(). The owner of the irq domain must dispose of all mappings before destroying the domain object. Fixes: cb8c474e ("gpio: sim: new testing module") Signed-off-by:
Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by:
Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Rob Clark authored
[ Upstream commit e531fdb5 ] If a signal callback releases the sw_sync fence, that will trigger a deadlock as the timeline_fence_release recurses onto the fence->lock (used both for signaling and the the timeline tree). To avoid that, temporarily hold an extra reference to the signalled fences until after we drop the lock. (This is an alternative implementation of https://patchwork.kernel.org/patch/11664717/ which avoids some potential UAF issues with the original patch.) v2: Remove now obsolete comment, use list_move_tail() and list_del_init() Reported-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: d3c6dd1f ("dma-buf/sw_sync: Synchronize signal vs syncpt free") Signed-off-by:
Rob Clark <robdclark@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230818145939.39697-1-robdclark@gmail.com Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Biju Das authored
[ Upstream commit 8fcc1c40 ] The pinctrl group and function creation/remove calls expect caller to take care of locking. Add lock around these functions. Fixes: b59d0e78 ("pinctrl: Add RZ/A2 pin and gpio controller") Signed-off-by:
Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by:
Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20230815131558.33787-4-biju.das.jz@bp.renesas.com Signed-off-by:
Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Biju Das authored
[ Upstream commit f982b9d5 ] Fix the below random NULL pointer crash during boot by serializing pinctrl group and function creation/remove calls in rzv2m_dt_subnode_to_map() with mutex lock. Crash logs: pc : __pi_strcmp+0x20/0x140 lr : pinmux_func_name_to_selector+0x68/0xa4 Call trace: __pi_strcmp+0x20/0x140 pinmux_generic_add_function+0x34/0xcc rzv2m_dt_subnode_to_map+0x2e4/0x418 rzv2m_dt_node_to_map+0x15c/0x18c pinctrl_dt_to_map+0x218/0x37c create_pinctrl+0x70/0x3d8 While at it, add a comment for lock. Fixes: 92a9b825 ("pinctrl: renesas: Add RZ/V2M pin and gpio controller driver") Signed-off-by:
Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by:
Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20230815131558.33787-3-biju.das.jz@bp.renesas.com Signed-off-by:
Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Biju Das authored
[ Upstream commit 661efa22 ] Fix the below random NULL pointer crash during boot by serializing pinctrl group and function creation/remove calls in rzg2l_dt_subnode_to_map() with mutex lock. Crash log: pc : __pi_strcmp+0x20/0x140 lr : pinmux_func_name_to_selector+0x68/0xa4 Call trace: __pi_strcmp+0x20/0x140 pinmux_generic_add_function+0x34/0xcc rzg2l_dt_subnode_to_map+0x314/0x44c rzg2l_dt_node_to_map+0x164/0x194 pinctrl_dt_to_map+0x218/0x37c create_pinctrl+0x70/0x3d8 While at it, add comments for bitmap_lock and lock. Fixes: c4c4637e ("pinctrl: renesas: Add RZ/G2L pin and gpio controller driver") Tested-by:
Chris Paterson <Chris.Paterson2@renesas.com> Signed-off-by:
Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by:
Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20230815131558.33787-2-biju.das.jz@bp.renesas.com Signed-off-by:
Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Biju Das authored
[ Upstream commit 2746f13f ] The COMMON_CLK config is not enabled in some of the architectures. This causes below build issues: pwm-rz-mtu3.c:(.text+0x114): undefined reference to `clk_rate_exclusive_put' pwm-rz-mtu3.c:(.text+0x32c): undefined reference to `clk_rate_exclusive_get' Fix these issues by moving clk_rate_exclusive_{get,put} inside COMMON_CLK code block, as clk.c is enabled by COMMON_CLK. Fixes: 55e9b8b7 ("clk: add clk_rate_exclusive api") Reported-by:
kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/all/202307251752.vLfmmhYm-lkp@intel.com/ Signed-off-by:
Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/r/20230725175140.361479-1-biju.das.jz@bp.renesas.com Signed-off-by:
Stephen Boyd <sboyd@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Zhu Wang authored
commit 60c5fd2e upstream. The raid_component_add() function was added to the kernel tree via patch "[SCSI] embryonic RAID class" (2005). Remove this function since it never has had any callers in the Linux kernel. And also raid_component_release() is only used in raid_component_add(), so it is also removed. Signed-off-by:
Zhu Wang <wangzhu9@huawei.com> Link: https://lore.kernel.org/r/20230822015254.184270-1-wangzhu9@huawei.com Reviewed-by:
Bart Van Assche <bvanassche@acm.org> Fixes: 04b5b5cb ("scsi: core: Fix possible memory leak if device_add() fails") Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Zhu Wang authored
commit 1bd3a768 upstream. Commit 41320b18 ("scsi: snic: Fix possible memory leak if device_add() fails") fixed the memory leak caused by dev_set_name() when device_add() failed. However, it did not consider that 'tgt' has already been released when put_device(&tgt->dev) is called. Remove kfree(tgt) in the error path to avoid double free of 'tgt' and move put_device(&tgt->dev) after the removed kfree(tgt) to avoid a use-after-free. Fixes: 41320b18 ("scsi: snic: Fix possible memory leak if device_add() fails") Signed-off-by:
Zhu Wang <wangzhu9@huawei.com> Link: https://lore.kernel.org/r/20230819083941.164365-1-wangzhu9@huawei.com Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yin Fengwei authored
commit 0e0e9bd5 upstream. Commit 98b211d6 ("madvise: convert madvise_free_pte_range() to use a folio") replaced the page_mapcount() with folio_mapcount() to check whether the folio is shared by other mapping. It's not correct for large folios. folio_mapcount() returns the total mapcount of large folio which is not suitable to detect whether the folio is shared. Use folio_estimated_sharers() which returns a estimated number of shares. That means it's not 100% correct. It should be OK for madvise case here. User-visible effects is that the THP is skipped when user call madvise. But the correct behavior is THP should be split and processed then. NOTE: this change is a temporary fix to reduce the user-visible effects before the long term fix from David is ready. Link: https://lkml.kernel.org/r/20230808020917.2230692-4-fengwei.yin@intel.com Fixes: 98b211d6 ("madvise: convert madvise_free_pte_range() to use a folio") Signed-off-by:
Yin Fengwei <fengwei.yin@intel.com> Reviewed-by:
Yu Zhao <yuzhao@google.com> Reviewed-by:
Ryan Roberts <ryan.roberts@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oliver Hartkopp authored
commit c275a176 upstream. Commit ee8b94c8 ("can: raw: fix receiver memory leak") introduced a new reference to the CAN netdevice that has assigned CAN filters. But this new ro->dev reference did not maintain its own refcount which lead to another KASAN use-after-free splat found by Eric Dumazet. This patch ensures a proper refcount for the CAN nedevice. Fixes: ee8b94c8 ("can: raw: fix receiver memory leak") Reported-by:
Eric Dumazet <edumazet@google.com> Cc: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by:
Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/r/20230821144547.6658-3-socketcan@hartkopp.net Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ming Lei authored
commit 9c7c4bc9 upstream. sizeof(struct ublksrv_io_cmd) is 16bytes, which can be held in 64byte SQE, so not necessary to check IO_URING_F_SQE128. With this change, we get chance to save half SQ ring memory. Fixed: 71f28f31 ("ublk_drv: add io_uring based userspace block driver") Signed-off-by:
Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230220041413.1524335-1-ming.lei@redhat.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sanjay R Mehta authored
commit 583893a6 upstream. Previously, on unplug events, the TMU mode was disabled first followed by the Time Synchronization Handshake, irrespective of whether the tb_switch_tmu_rate_write() API was successful or not. However, this caused a problem with Thunderbolt 3 (TBT3) devices, as the TSPacketInterval bits were always enabled by default, leading the host router to assume that the device router's TMU was already enabled and preventing it from initiating the Time Synchronization Handshake. As a result, TBT3 monitors experienced display flickering from the second hot plug onwards. To address this issue, we have modified the code to only disable the Time Synchronization Handshake during TMU disable if the tb_switch_tmu_rate_write() function is successful. This ensures that the TBT3 devices function correctly and eliminates the display flickering issue. Co-developed-by:
Sanath S <Sanath.S@amd.com> Signed-off-by:
Sanath S <Sanath.S@amd.com> Signed-off-by:
Sanjay R Mehta <sanju.mehta@amd.com> Cc: stable@vger.kernel.org Signed-off-by:
Mika Westerberg <mika.westerberg@linux.intel.com> [ USB4v2 introduced support for uni-directional TMU mode as part of d49b4f04 ("thunderbolt: Add support for enhanced uni-directional TMU mode") This is not a stable candidate commit, so adjust the code for backport. ] Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dietmar Eggemann authored
commit 2ef269ef upstream. cpuset_can_attach() can fail. Postpone DL BW allocation until all tasks have been checked. DL BW is not allocated per-task but as a sum over all DL tasks migrating. If multiple controllers are attached to the cgroup next to the cpuset controller a non-cpuset can_attach() can fail. In this case free DL BW in cpuset_cancel_attach(). Finally, update cpuset DL task count (nr_deadline_tasks) only in cpuset_attach(). Suggested-by:
Waiman Long <longman@redhat.com> Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by:
Juri Lelli <juri.lelli@redhat.com> Reviewed-by:
Waiman Long <longman@redhat.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dietmar Eggemann authored
commit 85989106 upstream. While moving a set of tasks between exclusive cpusets, cpuset_can_attach() -> task_can_attach() calls dl_cpu_busy(..., p) for DL BW overflow checking and per-task DL BW allocation on the destination root_domain for the DL tasks in this set. This approach has the issue of not freeing already allocated DL BW in the following error cases: (1) The set of tasks includes multiple DL tasks and DL BW overflow checking fails for one of the subsequent DL tasks. (2) Another controller next to the cpuset controller which is attached to the same cgroup fails in its can_attach(). To address this problem rework dl_cpu_busy(): (1) Split it into dl_bw_check_overflow() & dl_bw_alloc() and add a dedicated dl_bw_free(). (2) dl_bw_alloc() & dl_bw_free() take a `u64 dl_bw` parameter instead of a `struct task_struct *p` used in dl_cpu_busy(). This allows to allocate DL BW for a set of tasks too rather than only for a single task. Signed-off-by:
Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by:
Juri Lelli <juri.lelli@redhat.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Juri Lelli authored
commit c0f78fd5 upstream. update_tasks_root_domain currently iterates over all tasks even if no DEADLINE task is present on the cpuset/root domain for which bandwidth accounting is being rebuilt. This has been reported to introduce 10+ ms delays on suspend-resume operations. Skip the costly iteration for cpusets that don't contain DEADLINE tasks. Reported-by:
Qais Yousef (Google) <qyousef@layalina.io> Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ Signed-off-by:
Juri Lelli <juri.lelli@redhat.com> Reviewed-by:
Waiman Long <longman@redhat.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Juri Lelli authored
commit 6c24849f upstream. Qais reported that iterating over all tasks when rebuilding root domains for finding out which ones are DEADLINE and need their bandwidth correctly restored on such root domains can be a costly operation (10+ ms delays on suspend-resume). To fix the problem keep track of the number of DEADLINE tasks belonging to each cpuset and then use this information (followup patch) to only perform the above iteration if DEADLINE tasks are actually present in the cpuset for which a corresponding root domain is being rebuilt. Reported-by:
Qais Yousef (Google) <qyousef@layalina.io> Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ Signed-off-by:
Juri Lelli <juri.lelli@redhat.com> Reviewed-by:
Waiman Long <longman@redhat.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Juri Lelli authored
commit 111cd11b upstream. Turns out percpu_cpuset_rwsem - commit 1243dc51 ("cgroup/cpuset: Convert cpuset_mutex to percpu_rwsem") - wasn't such a brilliant idea, as it has been reported to cause slowdowns in workloads that need to change cpuset configuration frequently and it is also not implementing priority inheritance (which causes troubles with realtime workloads). Convert percpu_cpuset_rwsem back to regular cpuset_mutex. Also grab it only for SCHED_DEADLINE tasks (other policies don't care about stable cpusets anyway). Signed-off-by:
Juri Lelli <juri.lelli@redhat.com> Reviewed-by:
Waiman Long <longman@redhat.com> Signed-off-by:
Tejun Heo <tj@kernel.org> [ Conflict in kernel/cgroup/cpuset.c due to pulling new code/comments. Reject all new code. Remove BUG_ON() about rwsem that doesn't exist on mainline. ] Signed-off-by:
Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Juri Lelli authored
commit ad3a557d upstream. rebuild_root_domains() and update_tasks_root_domain() have neutral names, but actually deal with DEADLINE bandwidth accounting. Rename them to use 'dl_' prefix so that intent is more clear. No functional change. Suggested-by:
Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by:
Juri Lelli <juri.lelli@redhat.com> Reviewed-by:
Waiman Long <longman@redhat.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christian Brauner authored
commit 2d8ae8c4 upstream. We've aligned setgid behavior over multiple kernel releases. The details can be found in commit cf619f89 ("Merge tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping") and commit 426b4ca2 ("Merge tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux" ). Consistent setgid stripping behavior is now encapsulated in the setattr_should_drop_sgid() helper which is used by all filesystems that strip setgid bits outside of vfs proper. Usually ATTR_KILL_SGID is raised in e.g., chown_common() and is subject to the setattr_should_drop_sgid() check to determine whether the setgid bit can be retained. Since nfsd is raising ATTR_KILL_SGID unconditionally it will cause notify_change() to strip it even if the caller had the necessary privileges to retain it. Ensure that nfsd only raises ATR_KILL_SGID if the caller lacks the necessary privileges to retain the setgid bit. Without this patch the setgid stripping tests in LTP will fail: > As you can see, the problem is S_ISGID (0002000) was dropped on a > non-group-executable file while chown was invoked by super-user, while [...] > fchown02.c:66: TFAIL: testfile2: wrong mode permissions 0100700, expected 0102700 [...] > chown02.c:57: TFAIL: testfile2: wrong mode permissions 0100700, expected 0102700 With this patch all tests pass. Reported-by:
Sherry Yang <sherry.yang@oracle.com> Signed-off-by:
Christian Brauner <brauner@kernel.org> Reviewed-by:
Jeff Layton <jlayton@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> [Harshit: backport to 6.1.y: Use init_user_ns instead of nop_mnt_idmap as we don't have commit abf08576 ("fs: port vfs_*() helpers to struct mnt_idmap")] Signed-off-by:
Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christian Brauner authored
commit 4f704d9a upstream. We've aligned setgid behavior over multiple kernel releases. The details can be found in the following two merge messages: cf619f89 ("Merge tag 'fs.ovl.setgid.v6.2') 426b4ca2 ("Merge tag 'fs.setgid.v6.0') Consistent setgid stripping behavior is now encapsulated in the setattr_should_drop_sgid() helper which is used by all filesystems that strip setgid bits outside of vfs proper. Switch nfs to rely on this helper as well. Without this patch the setgid stripping tests in xfstests will fail. Signed-off-by:
Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Message-Id: <20230313-fs-nfs-setgid-v2-1-9a59f436cfc0@kernel.org> Signed-off-by:
Christian Brauner <brauner@kernel.org> [ Harshit: backport to 6.1.y: fs/internal.h -- minor conflict due to code change differences. include/linux/fs.h -- Used struct user_namespace *mnt_userns instead of struct mnt_idmap *idmap fs/nfs/inode.c -- Used init_user_ns instead of nop_mnt_idmap ] Signed-off-by:
Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hangbin Liu authored
commit 3c107f36 upstream. There are some issues with the bpf/nat6to4.c building. 1. It use TEST_CUSTOM_PROGS, which will add the nat6to4.o to kselftest-list file and run by common run_tests. 2. When building the test via `make -C tools/testing/selftests/ TARGETS="net"`, the nat6to4.o will be build in selftests/net/bpf/ folder. But in test udpgro_frglist.sh it refers to ../bpf/nat6to4.o. The correct path should be ./bpf/nat6to4.o. 3. If building the test via `make -C tools/testing/selftests/ TARGETS="net" install`. The nat6to4.o will be installed to kselftest_install/net/ folder. Then the udpgro_frglist.sh should refer to ./nat6to4.o. To fix the confusing test path, let's just move the nat6to4.c to net folder and build it as TEST_GEN_FILES. Fixes: edae34a3 ("selftests net: add UDP GRO fraglist + bpf self-tests") Tested-by:
Björn Töpel <bjorn@kernel.org> Signed-off-by:
Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20230118020927.3971864-1-liuhangbin@gmail.com Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Hardik Garg <hargar@linux.microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Aleksa Savic authored
commit 56b930dc upstream. Add a 200ms delay after sending a ctrl report to Quadro, Octo, D5 Next and Aquaero to give them enough time to process the request and save the data to memory. Otherwise, under heavier userspace loads where multiple sysfs entries are usually set in quick succession, a new ctrl report could be requested from the device while it's still processing the previous one and fail with -EPIPE. The delay is only applied if two ctrl report operations are near each other in time. Reported by a user on Github [1] and tested by both of us. [1] https://github.com/aleksamagicka/aquacomputer_d5next-hwmon/issues/82 Fixes: 752b9279 ("hwmon: (aquacomputer_d5next) Add support for Aquacomputer Octo") Signed-off-by:
Aleksa Savic <savicaleksa83@gmail.com> Link: https://lore.kernel.org/r/20230807172004.456968-1-savicaleksa83@gmail.com Signed-off-by:
Guenter Roeck <linux@roeck-us.net> [ removed Aquaero support as it's not in 6.1 ] Signed-off-by:
Aleksa Savic <savicaleksa83@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Feng Tang authored
commit 2c66ca39 upstream. 0-Day found a 34.6% regression in stress-ng's 'af-alg' test case, and bisected it to commit b81fac90 ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()"), which optimizes the FPU init order, and moves the CR4_OSXSAVE enabling into a later place: arch_cpu_finalize_init identify_boot_cpu identify_cpu generic_identify get_cpu_cap --> setup cpu capability ... fpu__init_cpu fpu__init_cpu_xstate cr4_set_bits(X86_CR4_OSXSAVE); As the FPU is not yet initialized the CPU capability setup fails to set X86_FEATURE_OSXSAVE. Many security module like 'camellia_aesni_avx_x86_64' depend on this feature and therefore fail to load, causing the regression. Cure this by setting X86_FEATURE_OSXSAVE feature right after OSXSAVE enabling. [ tglx: Moved it into the actual BSP FPU initialization code and added a comment ] Fixes: b81fac90 ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()") Reported-by:
kernel test robot <oliver.sang@intel.com> Signed-off-by:
Feng Tang <feng.tang@intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/202307192135.203ac24e-oliver.sang@intel.com Link: https://lore.kernel.org/lkml/20230823065747.92257-1-feng.tang@intel.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rick Edgecombe authored
commit 1f69383b upstream. The thread flag TIF_NEED_FPU_LOAD indicates that the FPU saved state is valid and should be reloaded when returning to userspace. However, the kernel will skip doing this if the FPU registers are already valid as determined by fpregs_state_valid(). The logic embedded there considers the state valid if two cases are both true: 1: fpu_fpregs_owner_ctx points to the current tasks FPU state 2: the last CPU the registers were live in was the current CPU. This is usually correct logic. A CPU’s fpu_fpregs_owner_ctx is set to the current FPU during the fpregs_restore_userregs() operation, so it indicates that the registers have been restored on this CPU. But this alone doesn’t preclude that the task hasn’t been rescheduled to a different CPU, where the registers were modified, and then back to the current CPU. To verify that this was not the case the logic relies on the second condition. So the assumption is that if the registers have been restored, AND they haven’t had the chance to be modified (by being loaded on another CPU), then they MUST be valid on the current CPU. Besides the lazy FPU optimizations, the other cases where the FPU registers might not be valid are when the kernel modifies the FPU register state or the FPU saved buffer. In this case the operation modifying the FPU state needs to let the kernel know the correspondence has been broken. The comment in “arch/x86/kernel/fpu/context.h” has: /* ... * If the FPU register state is valid, the kernel can skip restoring the * FPU state from memory. * * Any code that clobbers the FPU registers or updates the in-memory * FPU state for a task MUST let the rest of the kernel know that the * FPU registers are no longer valid for this task. * * Either one of these invalidation functions is enough. Invalidate * a resource you control: CPU if using the CPU for something else * (with preemption disabled), FPU for the current task, or a task that * is prevented from running by the current task. */ However, this is not completely true. When the kernel modifies the registers or saved FPU state, it can only rely on __fpu_invalidate_fpregs_state(), which wipes the FPU’s last_cpu tracking. The exec path instead relies on fpregs_deactivate(), which sets the CPU’s FPU context to NULL. This was observed to fail to restore the reset FPU state to the registers when returning to userspace in the following scenario: 1. A task is executing in userspace on CPU0 - CPU0’s FPU context points to tasks - fpu->last_cpu=CPU0 2. The task exec()’s 3. While in the kernel the task is preempted - CPU0 gets a thread executing in the kernel (such that no other FPU context is activated) - Scheduler sets task’s fpu->last_cpu=CPU0 when scheduling out 4. Task is migrated to CPU1 5. Continuing the exec(), the task gets to fpu_flush_thread()->fpu_reset_fpregs() - Sets CPU1’s fpu context to NULL - Copies the init state to the task’s FPU buffer - Sets TIF_NEED_FPU_LOAD on the task 6. The task reschedules back to CPU0 before completing the exec() and returning to userspace - During the reschedule, scheduler finds TIF_NEED_FPU_LOAD is set - Skips saving the registers and updating task’s fpu→last_cpu, because TIF_NEED_FPU_LOAD is the canonical source. 7. Now CPU0’s FPU context is still pointing to the task’s, and fpu->last_cpu is still CPU0. So fpregs_state_valid() returns true even though the reset FPU state has not been restored. So the root cause is that exec() is doing the wrong kind of invalidate. It should reset fpu->last_cpu via __fpu_invalidate_fpregs_state(). Further, fpu__drop() doesn't really seem appropriate as the task (and FPU) are not going away, they are just getting reset as part of an exec. So switch to __fpu_invalidate_fpregs_state(). Also, delete the misleading comment that says that either kind of invalidate will be enough, because it’s not always the case. Fixes: 33344368 ("x86/fpu: Clean up the fpu__clear() variants") Reported-by:
Lei Wang <lei4.wang@intel.com> Signed-off-by:
Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
Lijun Pan <lijun.pan@intel.com> Reviewed-by:
Sohil Mehta <sohil.mehta@intel.com> Acked-by:
Lijun Pan <lijun.pan@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230818170305.502891-1-rick.p.edgecombe@intel.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ankit Nautiyal authored
commit 5ad1ab30 upstream. DP DSC Receiver Capabilities are exposed via DPCD 60h-6Fh. Fix the DSC RECEIVER CAP SIZE accordingly. Fixes: ffddc436 ("drm/dp: Add DP DSC DPCD receiver capability size define and missing SHIFT") Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: <stable@vger.kernel.org> # v5.0+ Signed-off-by:
Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by:
Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by:
Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230818044436.177806-1-ankit.k.nautiyal@intel.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anshuman Gupta authored
commit 2872144a upstream. System wide suspend already has support for lmem save/restore during suspend therefore enabling d3cold for s2idle and keepng it disable for runtime PM.(Refer below commit for d3cold runtime PM disable justification) 'commit 66eb93e7 ("drm/i915/dgfx: Keep PCI autosuspend control 'on' by default on all dGPU")' It will reduce the DG2 Card power consumption to ~0 Watt for s2idle power KPI. v2: - Added "Cc: stable@vger.kernel.org". Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8755 Cc: stable@vger.kernel.org Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Tested-by:
Aaron Ma <aaron.ma@canonical.com> Tested-by:
Jianshui Yu <Jianshui.yu@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230816125216.1722002-1-anshuman.gupta@intel.com (cherry picked from commit 2643e6d1 ) Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-