- Aug 11, 2023
-
-
Yang Yingliang authored
[ Upstream commit 75ae8c28 ] dev_err() can be replace with dev_err_probe() which will check if error code is -EPROBE_DEFER. Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Stable-dep-of: ef45e840 ("net: ll_temac: fix error checking of irq_of_parse_and_map()") Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Tomas Glozar authored
[ Upstream commit 13d2618b ] Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC allocation later in sk_psock_init_link on PREEMPT_RT kernels, since GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist patchset notes for details). This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to BUG (sleeping function called from invalid context) on RT kernels. preempt_disable was introduced together with lock_sk and rcu_read_lock in commit 99ba2b5a ("bpf: sockhash, disallow bpf_tcp_close and update in parallel"), probably to match disabled migration of BPF programs, and is no longer necessary. Remove preempt_disable to fix BUG in sock_map_update_common on RT. Signed-off-by:
Tomas Glozar <tglozar@redhat.com> Reviewed-by:
Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/all/20200224140131.461979697@linutronix.de/ Fixes: 99ba2b5a ("bpf: sockhash, disallow bpf_tcp_close and update in parallel") Reviewed-by:
John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20230728064411.305576-1-tglozar@redhat.com Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
valis authored
[ Upstream commit b80b829e ] When route4_change() is called on an existing filter, the whole tcf_result struct is always copied into the new instance of the filter. This causes a problem when updating a filter bound to a class, as tcf_unbind_filter() is always called on the old instance in the success path, decreasing filter_cnt of the still referenced class and allowing it to be deleted, leading to a use-after-free. Fix this by no longer copying the tcf_result struct from the old filter. Fixes: 1109c005 ("net: sched: RCU cls_route") Reported-by:
valis <sec@valis.email> Reported-by:
Bing-Jhong Billy Jheng <billy@starlabs.sg> Signed-off-by:
valis <sec@valis.email> Signed-off-by:
Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by:
Victor Nogueira <victor@mojatatu.com> Reviewed-by:
Pedro Tammela <pctammela@mojatatu.com> Reviewed-by:
M A Ramdhan <ramdhan@starlabs.sg> Link: https://lore.kernel.org/r/20230729123202.72406-4-jhs@mojatatu.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
valis authored
[ Upstream commit 76e42ae8 ] When fw_change() is called on an existing filter, the whole tcf_result struct is always copied into the new instance of the filter. This causes a problem when updating a filter bound to a class, as tcf_unbind_filter() is always called on the old instance in the success path, decreasing filter_cnt of the still referenced class and allowing it to be deleted, leading to a use-after-free. Fix this by no longer copying the tcf_result struct from the old filter. Fixes: e35a8ee5 ("net: sched: fw use RCU") Reported-by:
valis <sec@valis.email> Reported-by:
Bing-Jhong Billy Jheng <billy@starlabs.sg> Signed-off-by:
valis <sec@valis.email> Signed-off-by:
Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by:
Victor Nogueira <victor@mojatatu.com> Reviewed-by:
Pedro Tammela <pctammela@mojatatu.com> Reviewed-by:
M A Ramdhan <ramdhan@starlabs.sg> Link: https://lore.kernel.org/r/20230729123202.72406-3-jhs@mojatatu.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
valis authored
[ Upstream commit 3044b16e ] When u32_change() is called on an existing filter, the whole tcf_result struct is always copied into the new instance of the filter. This causes a problem when updating a filter bound to a class, as tcf_unbind_filter() is always called on the old instance in the success path, decreasing filter_cnt of the still referenced class and allowing it to be deleted, leading to a use-after-free. Fix this by no longer copying the tcf_result struct from the old filter. Fixes: de5df632 ("net: sched: cls_u32 changes to knode must appear atomic to readers") Reported-by:
valis <sec@valis.email> Reported-by:
M A Ramdhan <ramdhan@starlabs.sg> Signed-off-by:
valis <sec@valis.email> Signed-off-by:
Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by:
Victor Nogueira <victor@mojatatu.com> Reviewed-by:
Pedro Tammela <pctammela@mojatatu.com> Reviewed-by:
M A Ramdhan <ramdhan@starlabs.sg> Link: https://lore.kernel.org/r/20230729123202.72406-2-jhs@mojatatu.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Dumazet authored
[ Upstream commit e5f0d2dd ] In a prior commit I forgot that sk_getsockopt() reads sk->sk_ll_usec without holding a lock. Fixes: 0dbffbb5 ("net: annotate data race around sk_ll_usec") Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Dumazet authored
[ Upstream commit 11695c6e ] sk_getsockopt() runs locklessly, thus we need to annotate the read of sk->sk_peek_off. While we are at it, add corresponding annotations to sk_set_peek_off() and unix_set_peek_off(). Fixes: b9bb53f3 ("sock: convert sk_peek_offset functions to WRITE_ONCE") Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Dumazet authored
[ Upstream commit b4b55325 ] In a prior commit, I forgot to change sk_getsockopt() when reading sk->sk_rcvbuf locklessly. Fixes: ebb3b78d ("tcp: annotate sk->sk_rcvbuf lockless reads") Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Dumazet authored
[ Upstream commit 74bc0843 ] In a prior commit, I forgot to change sk_getsockopt() when reading sk->sk_sndbuf locklessly. Fixes: e292f05e ("tcp: annotate sk->sk_sndbuf lockless reads") Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Dumazet authored
[ Upstream commit e6d12bdb ] In a prior commit, I forgot to change sk_getsockopt() when reading sk->sk_rcvlowat locklessly. Fixes: eac66402 ("net: annotate sk->sk_rcvlowat lockless reads") Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Dumazet authored
[ Upstream commit ea7f45ef ] sk_getsockopt() runs locklessly. This means sk->sk_max_pacing_rate can be read while other threads are changing its value. Fixes: 62748f32 ("net: introduce SO_MAX_PACING_RATE") Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Chengfeng Ye authored
[ Upstream commit 56c6be35 ] As &hc->lock is acquired by both timer _hfcpci_softirq() and hardirq hfcpci_int(), the timer should disable irq before lock acquisition otherwise deadlock could happen if the timmer is preemtped by the hadr irq. Possible deadlock scenario: hfcpci_softirq() (timer) -> _hfcpci_softirq() -> spin_lock(&hc->lock); <irq interruption> -> hfcpci_int() -> spin_lock(&hc->lock); (deadlock here) This flaw was found by an experimental static analysis tool I am developing for irq-related deadlock. The tentative patch fixes the potential deadlock by spin_lock_irq() in timer. Fixes: b36b654a ("mISDN: Create /sys/class/mISDN") Signed-off-by:
Chengfeng Ye <dg573847474@gmail.com> Link: https://lore.kernel.org/r/20230727085619.7419-1-dg573847474@gmail.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jamal Hadi Salim authored
[ Upstream commit e68409db ] A match entry is uniquely identified with an "address" or "path" in the form of: hashtable ID(12b):bucketid(8b):nodeid(12b). When creating table match entries all of hash table id, bucket id and node (match entry id) are needed to be either specified by the user or reasonable in-kernel defaults are used. The in-kernel default for a table id is 0x800(omnipresent root table); for bucketid it is 0x0. Prior to this fix there was none for a nodeid i.e. the code assumed that the user passed the correct nodeid and if the user passes a nodeid of 0 (as Mingi Cho did) then that is what was used. But nodeid of 0 is reserved for identifying the table. This is not a problem until we dump. The dump code notices that the nodeid is zero and assumes it is referencing a table and therefore references table struct tc_u_hnode instead of what was created i.e match entry struct tc_u_knode. Ming does an equivalent of: tc filter add dev dummy0 parent 10: prio 1 handle 0x1000 \ protocol ip u32 match ip src 10.0.0.1/32 classid 10:1 action ok Essentially specifying a table id 0, bucketid 1 and nodeid of zero Tableid 0 is remapped to the default of 0x800. Bucketid 1 is ignored and defaults to 0x00. Nodeid was assumed to be what Ming passed - 0x000 dumping before fix shows: ~$ tc filter ls dev dummy0 parent 10: filter protocol ip pref 1 u32 chain 0 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor -30591 Note that the last line reports a table instead of a match entry (you can tell this because it says "ht divisor..."). As a result of reporting the wrong data type (misinterpretting of struct tc_u_knode as being struct tc_u_hnode) the divisor is reported with value of -30591. Ming identified this as part of the heap address (physmap_base is 0xffff8880 (-30591 - 1)). The fix is to ensure that when table entry matches are added and no nodeid is specified (i.e nodeid == 0) then we get the next available nodeid from the table's pool. After the fix, this is what the dump shows: $ tc filter ls dev dummy0 parent 10: filter protocol ip pref 1 u32 chain 0 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1 filter protocol ip pref 1 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:1 not_in_hw match 0a000001/ffffffff at 12 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 Reported-by:
Mingi Cho <mgcho.minic@gmail.com> Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20230726135151.416917-1-jhs@mojatatu.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Georg Müller authored
[ Upstream commit 98ce8e4a ] Without gcc, the test will fail. On cleanup, ignore probe removal errors. Otherwise, in case of an error adding the probe, the temporary directory is not removed. Fixes: 56cbeacf ("perf probe: Add test for regression introduced by switch to die_get_decl_file()") Signed-off-by:
Georg Müller <georgmueller@gmx.net> Acked-by:
Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Georg Müller <georgmueller@gmx.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230728151812.454806-2-georgmueller@gmx.net Link: https://lore.kernel.org/r/CAP-5=fUP6UuLgRty3t2=fQsQi3k4hDMz415vWdp1x88QMvZ8ug@mail.gmail.com/ Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Yuanjun Gong authored
[ Upstream commit dadc5b86 ] in bcm_sf2_sw_probe(), check the return value of clk_prepare_enable() and return the error code if clk_prepare_enable() returns an unexpected value. Fixes: e9ec5c3b ("net: dsa: bcm_sf2: request and handle clocks") Signed-off-by:
Yuanjun Gong <ruc_gongyuanjun@163.com> Reviewed-by:
Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20230726170506.16547-1-ruc_gongyuanjun@163.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Lin Ma authored
[ Upstream commit d73ef2d6 ] There are totally 9 ndo_bridge_setlink handlers in the current kernel, which are 1) bnxt_bridge_setlink, 2) be_ndo_bridge_setlink 3) i40e_ndo_bridge_setlink 4) ice_bridge_setlink 5) ixgbe_ndo_bridge_setlink 6) mlx5e_bridge_setlink 7) nfp_net_bridge_setlink 8) qeth_l2_bridge_setlink 9) br_setlink. By investigating the code, we find that 1-7 parse and use nlattr IFLA_BRIDGE_MODE but 3 and 4 forget to do the nla_len check. This can lead to an out-of-attribute read and allow a malformed nlattr (e.g., length 0) to be viewed as a 2 byte integer. To avoid such issues, also for other ndo_bridge_setlink handlers in the future. This patch adds the nla_len check in rtnl_bridge_setlink and does an early error return if length mismatches. To make it works, the break is removed from the parsing for IFLA_BRIDGE_FLAGS to make sure this nla_for_each_nested iterates every attribute. Fixes: b1edc14a ("ice: Implement ice_bridge_getlink and ice_bridge_setlink") Fixes: 51616018 ("i40e: Add support for getlink, setlink ndo ops") Suggested-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Lin Ma <linma@zju.edu.cn> Acked-by:
Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by:
Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20230726075314.1059224-1-linma@zju.edu.cn Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Lin Ma authored
[ Upstream commit bcc29b7f ] The nla_for_each_nested parsing in function bpf_sk_storage_diag_alloc does not check the length of the nested attribute. This can lead to an out-of-attribute read and allow a malformed nlattr (e.g., length 0) to be viewed as a 4 byte integer. This patch adds an additional check when the nlattr is getting counted. This makes sure the latter nla_get_u32 can access the attributes with the correct length. Fixes: 1ed4d924 ("bpf: INET_DIAG support in bpf_sk_storage") Suggested-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Lin Ma <linma@zju.edu.cn> Reviewed-by:
Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230725023330.422856-1-linma@zju.edu.cn Signed-off-by:
Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Yuanjun Gong authored
[ Upstream commit e5bcb756 ] mlx5e_ipsec_remove_trailer() should return an error code if function pskb_trim() returns an unexpected value. Fixes: 2ac9cfe7 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path") Signed-off-by:
Yuanjun Gong <ruc_gongyuanjun@163.com> Reviewed-by:
Leon Romanovsky <leonro@nvidia.com> Signed-off-by:
Saeed Mahameed <saeedm@nvidia.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Zhengchao Shao authored
[ Upstream commit 5dd77585 ] when mlx5_cmd_exec failed in mlx5dr_cmd_create_reformat_ctx, the memory pointed by 'in' is not released, which will cause memory leak. Move memory release after mlx5_cmd_exec. Fixes: 1d918647 ("net/mlx5: DR, Add direct rule command utilities") Signed-off-by:
Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by:
Leon Romanovsky <leonro@nvidia.com> Signed-off-by:
Saeed Mahameed <saeedm@nvidia.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Ilan Peer authored
[ Upstream commit fd7f08d9 ] The reporter noticed a warning when running iwlwifi: WARNING: CPU: 8 PID: 659 at mm/page_alloc.c:4453 __alloc_pages+0x329/0x340 As cfg80211_parse_colocated_ap() is not expected to return a negative value return 0 and not a negative value if cfg80211_calc_short_ssid() fails. Fixes: c8cb5b85 ("nl80211/cfg80211: support 6 GHz scanning") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217675 Signed-off-by:
Ilan Peer <ilan.peer@intel.com> Signed-off-by:
Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230723201043.3007430-1-ilan.peer@intel.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Heiko Carstens authored
[ Upstream commit 0c02cc57 ] Commit 9fb6c9b3 ("s390/sthyi: add cache to store hypervisor info") added cache handling for store hypervisor info. This also changed the possible return code for sthyi_fill(). Instead of only returning a condition code like the sthyi instruction would do, it can now also return a negative error value (-ENOMEM). handle_styhi() was not changed accordingly. In case of an error, the negative error value would incorrectly injected into the guest PSW. Add proper error handling to prevent this, and update the comment which describes the possible return values of sthyi_fill(). Fixes: 9fb6c9b3 ("s390/sthyi: add cache to store hypervisor info") Reviewed-by:
Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20230727182939.2050744-1-hca@linux.ibm.com Signed-off-by:
Heiko Carstens <hca@linux.ibm.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
ndesaulniers@google.com authored
[ Upstream commit 79e8328e ] Compiling big-endian targets with Clang produces the diagnostic: fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical] } while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants))); ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ || fs/namei.c:2173:13: note: cast one or both operands to int to silence this warning It appears that when has_zero was introduced, two definitions were produced with different signatures (in particular different return types). Looking at the usage in hash_name() in fs/namei.c, I suspect that has_zero() is meant to be invoked twice per while loop iteration; using logical-or would not update `bdata` when `a` did not have zeros. So I think it's preferred to always return an unsigned long rather than a bool than update the while loop in hash_name() to use a logical-or rather than bitwise-or. [ Also changed powerpc version to do the same - Linus ] Link: https://github.com/ClangBuiltLinux/linux/issues/1832 Link: https://lore.kernel.org/lkml/20230801-bitwise-v1-1-799bec468dc4@google.com/ Fixes: 36126f8f ("word-at-a-time: make the interfaces truly generic") Debugged-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Nick Desaulniers <ndesaulniers@google.com> Acked-by:
Heiko Carstens <hca@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Hugo Villeneuve authored
[ Upstream commit 253be5b5 ] For SOMs with an onboard PHY, the RESET_N pull-up resistor is currently deactivated in the pinmux configuration. When the pinmux code selects the GPIO function for this pin, with a default direction of input, this prevents the RESET_N pin from being taken to the proper 3.3V level (deasserted), and this results in the PHY being not detected since it is held in reset. Taken from RESET_N pin description in ADIN13000 datasheet: This pin requires a 1K pull-up resistor to AVDD_3P3. Activate the pull-up resistor to fix the issue. Fixes: ade0176d ("arm64: dts: imx8mn-var-som: Add Variscite VAR-SOM-MX8MN System on Module") Signed-off-by:
Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by:
Fabio Estevam <festevam@gmail.com> Signed-off-by:
Shawn Guo <shawnguo@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Bart Van Assche authored
commit 2112f5c1 upstream. We noticed that the user interface of Android devices becomes very slow under memory pressure. This is because Android uses the zram driver on top of the loop driver for swapping, because under memory pressure the swap code alternates reads and writes quickly, because mq-deadline is the default scheduler for loop devices and because mq-deadline delays writes by five seconds for such a workload with default settings. Fix this by making the kernel select I/O scheduler 'none' from inside add_disk() for loop devices. This default can be overridden at any time from user space, e.g. via a udev rule. This approach has an advantage compared to changing the I/O scheduler from userspace from 'mq-deadline' into 'none', namely that synchronize_rcu() does not get called. This patch changes the default I/O scheduler for loop devices from 'mq-deadline' into 'none'. Additionally, this patch reduces the Android boot time on my test setup with 0.5 seconds compared to configuring the loop I/O scheduler from user space. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Martijn Coenen <maco@android.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20210805174200.3250718-3-bvanassche@acm.org Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 1af6239d upstream. With the advent of CFI it is no longer acceptible to cast function pointers. The robot complains thusly: kernel-events-core.c:warning:cast-from-int-(-)(struct-perf_cpu_pmu_context-)-to-remote_function_f-(aka-int-(-)(void-)-)-converts-to-incompatible-function-type Reported-by:
kernel test robot <lkp@intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Cixi Geng <cixi.geng1@unisoc.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jens Axboe authored
Commit 7b72d661 upstream. A previous commit made all cqring waits marked as iowait, as a way to improve performance for short schedules with pending IO. However, for use cases that have a special reaper thread that does nothing but wait on events on the ring, this causes a cosmetic issue where we know have one core marked as being "busy" with 100% iowait. While this isn't a grave issue, it is confusing to users. Rather than always mark us as being in iowait, gate setting of current->in_iowait to 1 by whether or not the waiting task has pending requests. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/io-uring/CAMEGJJ2RxopfNQ7GNLhr7X9=bHXKo+G5OOe0LUq=+UgLXsv1Xg@mail.gmail.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=217699 Link: https://bugzilla.kernel.org/show_bug.cgi?id=217700 Reported-by:
Oleksandr Natalenko <oleksandr@natalenko.name> Reported-by:
Phil Elwell <phil@raspberrypi.com> Tested-by:
Andres Freund <andres@anarazel.de> Fixes: 8a796565 ("io_uring: Use io_schedule* in cqring wait") Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nadav Amit authored
[ Upstream commit 8924779d ] When kprobes emulates JNG/JNLE instructions on x86 it uses the wrong condition. For JNG (opcode: 0F 8E), according to Intel SDM, the jump is performed if (ZF == 1 or SF != OF). However the kernel emulation currently uses 'and' instead of 'or'. As a result, setting a kprobe on JNG/JNLE might cause the kernel to behave incorrectly whenever the kprobe is hit. Fix by changing the 'and' to 'or'. Fixes: 6256e668 ("x86/kprobes: Use int3 instead of debug trap for single-step") Signed-off-by:
Nadav Amit <namit@vmware.com> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220813225943.143767-1-namit@vmware.com Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu (Google) authored
[ Upstream commit dec8784c ] Fix kprobes to update kcb (kprobes control block) status flag to KPROBE_HIT_SSDONE even if the kp->post_handler is not set. This bug may cause a kernel panic if another INT3 user runs right after kprobes because kprobe_int3_handler() misunderstands the INT3 is kprobe's single stepping INT3. Fixes: 6256e668 ("x86/kprobes: Use int3 instead of debug trap for single-step") Reported-by:
Daniel Müller <deso@posteo.net> Signed-off-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Tested-by:
Daniel Müller <deso@posteo.net> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20220727210136.jjgc3lpqeq42yr3m@muellerd-fedora-PC2BDTX9 Link: https://lore.kernel.org/r/165942025658.342061.12452378391879093249.stgit@devnote2 Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wei Yongjun authored
[ Upstream commit 2304d14d ] Address this GCC warning: arch/x86/kernel/kprobes/core.c:940:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration] 940 | static int nokprobe_inline kprobe_is_ss(struct kprobe_ctlblk *kcb) | ^~~~~~ [ mingo: Tidied up the changelog. ] Fixes: 6256e668 : ("x86/kprobes: Use int3 instead of debug trap for single-step") Reported-by:
Hulk Robot <hulkci@huawei.com> Signed-off-by:
Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Acked-by:
Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20210324144502.1154883-1-weiyongjun1@huawei.com Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
[ Upstream commit 2f706e0e ] Fix can_boost() to identify indirect jmp and others using range case correctly. Since the condition in switch statement is opcode & 0xf0, it can not evaluate to 0xff case. This should be under the 0xf0 case. However, there is no reason to use the conbinations of the bit-masked condition and lower bit checking. Use range case to clean up the switch statement too. Fixes: 6256e668 ("x86/kprobes: Use int3 instead of debug trap for single-step") Reported-by:
Colin Ian King <colin.king@canonical.com> Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/161666692308.1120877.4675552834049546493.stgit@devnote2 Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
[ Upstream commit 6256e668 ] Use int3 instead of debug trap exception for single-stepping the probed instructions. Some instructions which change the ip registers or modify IF flags are emulated because those are not able to be single-stepped by int3 or may allow the interrupt while single-stepping. This actually changes the kprobes behavior. - kprobes can not probe following instructions; int3, iret, far jmp/call which get absolute address as immediate, indirect far jmp/call, indirect near jmp/call with addressing by memory (register-based indirect jmp/call are OK), and vmcall/vmlaunch/vmresume/vmxoff. - If the kprobe post_handler doesn't set before registering, it may not be called in some case even if you set it afterwards. (IOW, kprobe booster is enabled at registration, user can not change it) But both are rare issue, unsupported instructions will not be used in the kernel (or rarely used), and post_handlers are rarely used (I don't see it except for the test code). Suggested-by:
Andy Lutomirski <luto@kernel.org> Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/161469874601.49483.11985325887166921076.stgit@devnote2 [Huafei: Fix trivial conflict in arch/x86/kernel/kprobes/core.c due to the previously backported commit 6dd3b8c9 ] Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
[ Upstream commit a194acd3 ] Since Grp5 far indirect JMP is FF "mod 101 r/m", it should be (modrm & 0x38) == 0x28, and near indirect JMP is also 0x38 == 0x20. So we can mask modrm with 0x30 and check 0x20. This is actually what the original code does, it also doesn't care the last bit. So the result code is same. Thus, I think this is just a cosmetic cleanup. Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/161469873475.49483.13257083019966335137.stgit@devnote2 Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
[ Upstream commit d60ad3d4 ] Since the opcodes start from 0xff are group5 instruction group which is not 2 bytes opcode but the extended opcode determined by the MOD/RM byte. The commit abd82e53 ("x86/kprobes: Do not decode opcode in resume_execution()") used insn->opcode.bytes[1], but that is not correct. We have to refer the insn->modrm.bytes[1] instead. Fixes: abd82e53 ("x86/kprobes: Do not decode opcode in resume_execution()") Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/161469872400.49483.18214724458034233166.stgit@devnote2 Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
[ Upstream commit abd82e53 ] Currently, kprobes decodes the opcode right after single-stepping in resume_execution(). But the opcode was already decoded while preparing arch_specific_insn in arch_copy_kprobe(). Decode the opcode in arch_copy_kprobe() instead of in resume_execution() and set some flags which classify the opcode for the resuming process. [ bp: Massage commit message. ] Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/160830072561.349576.3014979564448023213.stgit@devnote2 Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Gustavo A. R. Silva authored
[ Upstream commit e689b300 ] In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of just letting the code fall through to the next case. Signed-off-by:
Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://github.com/KSPP/linux/issues/115 Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Petazzoni authored
commit e51df4f8 upstream. In commit 2cb1e025 ("ASoC: cs42l51: re-hook of_match_table pointer"), 9 years ago, some random guy fixed the cs42l51 after it was split into a core part and an I2C part to properly match based on a Device Tree compatible string. However, the fix in this commit is wrong: the MODULE_DEVICE_TABLE(of, ....) is in the core part of the driver, not the I2C part. Therefore, automatic module loading based on module.alias, based on matching with the DT compatible string, loads the core part of the driver, but not the I2C part. And threfore, the i2c_driver is not registered, and the codec is not known to the system, nor matched with a DT node with the corresponding compatible string. In order to fix that, we move the MODULE_DEVICE_TABLE(of, ...) into the I2C part of the driver. The cs42l51_of_match[] array is also moved as well, as it is not possible to have this definition in one file, and the MODULE_DEVICE_TABLE(of, ...) invocation in another file, due to how MODULE_DEVICE_TABLE works. Thanks to this commit, the I2C part of the driver now properly autoloads, and thanks to its dependency on the core part, the core part gets autoloaded as well, resulting in a functional sound card without having to manually load kernel modules. Fixes: 2cb1e025 ("ASoC: cs42l51: re-hook of_match_table pointer") Cc: stable@vger.kernel.org Signed-off-by:
Thomas Petazzoni <thomas.petazzoni@bootlin.com> Link: https://lore.kernel.org/r/20230713112112.778576-1-thomas.petazzoni@bootlin.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jens Axboe authored
commit a9be2022 upstream. io-wq assumes that an issue is blocking, but it may not be if the request type has asked for a non-blocking attempt. If we get -EAGAIN for that case, then we need to treat it as a final result and not retry or arm poll for it. Cc: stable@vger.kernel.org # 5.10+ Link: https://github.com/axboe/liburing/issues/897 Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Matthieu Baerts authored
commit 6c8880fc upstream. MPTCP selftests are using TCP SYN Cookies for quite a while now, since v5.9. Some CIs don't have this config option enabled and this is causing issues in the tests: # ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 167ms) sysctl: cannot stat /proc/sys/net/ipv4/tcp_syncookies: No such file or directory # [ OK ]./mptcp_connect.sh: line 554: [: -eq: unary operator expected There is no impact in the results but the test is not doing what it is supposed to do. Fixes: fed61c4b ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally") Cc: stable@vger.kernel.org Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafael J. Wysocki authored
commit e8a0e30b upstream. After making acpi_processor_get_platform_limit() use the "no limit" value for its frequency QoS request when _PPC returns 0, it is not necessary to replace the frequency corresponding to the first _PSS return package entry with the maximum turbo frequency of the given CPU in intel_pstate_init_acpi_perf_limits() any more, so drop the code doing that along with the comment explaining it. Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by:
Hagar Hemdan <hagarhem@amazon.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafael J. Wysocki authored
commit 99387b01 upstream. Modify acpi_processor_get_platform_limit() to avoid updating its frequency QoS request when the _PPC return value has not changed by comparing that value to the previous _PPC return value stored in the performance_platform_limit field of the struct acpi_processor corresponding to the given CPU. While at it, do the _PPC return value check against the state count earlier, to avoid setting performance_platform_limit to an invalid value, and make acpi_processor_ppc_init() use FREQ_QOS_MAX_DEFAULT_VALUE as the "no limit" frequency QoS for consistency. Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by:
Hagar Hemdan <hagarhem@amazon.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-