Skip to content
  1. Sep 15, 2022
  2. Sep 05, 2022
    • Greg Kroah-Hartman's avatar
      Linux 4.19.257 · 41b46409
      Greg Kroah-Hartman authored
      
      
      Link: https://lore.kernel.org/r/20220902121400.219861128@linuxfoundation.org
      Tested-by: default avatarJon Hunter <jonathanh@nvidia.com>
      Tested-by: default avatarShuah Khan <skhan@linuxfoundation.org>
      Tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Tested-by: default avatarSudip Mukherjee <sudip.mukherjee@codethink.co.uk>
      Tested-by: default avatarLinux Kernel Functional Testing <lkft@linaro.org>
      Tested-by: default avatarPavel Machek (CIP) <pavel@denx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      v4.19.257
      41b46409
    • Yang Yingliang's avatar
      net: neigh: don't call kfree_skb() under spin_lock_irqsave() · 509e1b32
      Yang Yingliang authored
      commit d5485d9d upstream.
      
      It is not allowed to call kfree_skb() from hardware interrupt
      context or with interrupts being disabled. So add all skb to
      a tmp list, then free them after spin_unlock_irqrestore() at
      once.
      
      Fixes: 66ba215c
      
       ("neigh: fix possible DoS due to net iface start/stop loop")
      Suggested-by: default avatarDenis V. Lunev <den@openvz.org>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: default avatarNikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      509e1b32
    • Kuniyuki Iwashima's avatar
      kprobes: don't call disarm_kprobe() for disabled kprobes · fc91d2db
      Kuniyuki Iwashima authored
      commit 9c80e799 upstream.
      
      The assumption in __disable_kprobe() is wrong, and it could try to disarm
      an already disarmed kprobe and fire the WARN_ONCE() below. [0]  We can
      easily reproduce this issue.
      
      1. Write 0 to /sys/kernel/debug/kprobes/enabled.
      
        # echo 0 > /sys/kernel/debug/kprobes/enabled
      
      2. Run execsnoop.  At this time, one kprobe is disabled.
      
        # /usr/share/bcc/tools/execsnoop &
        [1] 2460
        PCOMM            PID    PPID   RET ARGS
      
        # cat /sys/kernel/debug/kprobes/list
        ffffffff91345650  r  __x64_sys_execve+0x0    [FTRACE]
        ffffffff91345650  k  __x64_sys_execve+0x0    [DISABLED][FTRACE]
      
      3. Write 1 to /sys/kernel/debug/kprobes/enabled, which changes
         kprobes_all_disarmed to false but does not arm the disabled kprobe.
      
        # echo 1 > /sys/kernel/debug/kprobes/enabled
      
        # cat /sys/kernel/debug/kprobes/list
        ffffffff91345650  r  __x64_sys_execve+0x0    [FTRACE]
        ffffffff91345650  k  __x64_sys_execve+0x0    [DISABLED][FTRACE]
      
      4. Kill execsnoop, when __disable_kprobe() calls disarm_kprobe() for the
         disabled kprobe and hits the WARN_ONCE() in __disarm_kprobe_ftrace().
      
        # fg
        /usr/share/bcc/tools/execsnoop
        ^C
      
      Actually, WARN_ONCE() is fired twice, and __unregister_kprobe_top() misses
      some cleanups and leaves the aggregated kprobe in the hash table.  Then,
      __unregister_trace_kprobe() initialises tk->rp.kp.list and creates an
      infinite loop like this.
      
        aggregated kprobe.list -> kprobe.list -.
                                           ^    |
                                           '.__.'
      
      In this situation, these commands fall into the infinite loop and result
      in RCU stall or soft lockup.
      
        cat /sys/kernel/debug/kprobes/list : show_kprobe_addr() enters into the
                                             infinite loop with RCU.
      
        /usr/share/bcc/tools/execsnoop : warn_kprobe_rereg() holds kprobe_mutex,
                                         and __get_valid_kprobe() is stuck in
      				   the loop.
      
      To avoid the issue, make sure we don't call disarm_kprobe() for disabled
      kprobes.
      
      [0]
      Failed to disarm kprobe-ftrace at __x64_sys_execve+0x0/0x40 (error -2)
      WARNING: CPU: 6 PID: 2460 at kernel/kprobes.c:1130 __disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129)
      Modules linked in: ena
      CPU: 6 PID: 2460 Comm: execsnoop Not tainted 5.19.0+ #28
      Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
      RIP: 0010:__disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129)
      Code: 24 8b 02 eb c1 80 3d c4 83 f2 01 00 75 d4 48 8b 75 00 89 c2 48 c7 c7 90 fa 0f 92 89 04 24 c6 05 ab 83 01 e8 e4 94 f0 ff <0f> 0b 8b 04 24 eb b1 89 c6 48 c7 c7 60 fa 0f 92 89 04 24 e8 cc 94
      RSP: 0018:ffff9e6ec154bd98 EFLAGS: 00010282
      RAX: 0000000000000000 RBX: ffffffff930f7b00 RCX: 0000000000000001
      RDX: 0000000080000001 RSI: ffffffff921461c5 RDI: 00000000ffffffff
      RBP: ffff89c504286da8 R08: 0000000000000000 R09: c0000000fffeffff
      R10: 0000000000000000 R11: ffff9e6ec154bc28 R12: ffff89c502394e40
      R13: ffff89c502394c00 R14: ffff9e6ec154bc00 R15: 0000000000000000
      FS:  00007fe800398740(0000) GS:ffff89c812d80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000c00057f010 CR3: 0000000103b54006 CR4: 00000000007706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
      <TASK>
       __disable_kprobe (kernel/kprobes.c:1716)
       disable_kprobe (kernel/kprobes.c:2392)
       __disable_trace_kprobe (kernel/trace/trace_kprobe.c:340)
       disable_trace_kprobe (kernel/trace/trace_kprobe.c:429)
       perf_trace_event_unreg.isra.2 (./include/linux/tracepoint.h:93 kernel/trace/trace_event_perf.c:168)
       perf_kprobe_destroy (kernel/trace/trace_event_perf.c:295)
       _free_event (kernel/events/core.c:4971)
       perf_event_release_kernel (kernel/events/core.c:5176)
       perf_release (kernel/events/core.c:5186)
       __fput (fs/file_table.c:321)
       task_work_run (./include/linux/sched.h:2056 (discriminator 1) kernel/task_work.c:179 (discriminator 1))
       exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49 kernel/entry/common.c:169 kernel/entry/common.c:201)
       syscall_exit_to_user_mode (./arch/x86/include/asm/jump_label.h:55 ./arch/x86/include/asm/nospec-branch.h:384 ./arch/x86/include/asm/entry-common.h:94 kernel/entry/common.c:133 kernel/entry/common.c:296)
       do_syscall_64 (arch/x86/entry/common.c:87)
       entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
      RIP: 0033:0x7fe7ff210654
      Code: 15 79 89 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb be 0f 1f 00 8b 05 9a cd 20 00 48 63 ff 85 c0 75 11 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3a f3 c3 48 83 ec 18 48 89 7c 24 08 e8 34 fc
      RSP: 002b:00007ffdbd1d3538 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
      RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007fe7ff210654
      RDX: 0000000000000000 RSI: 0000000000002401 RDI: 0000000000000008
      RBP: 0000000000000000 R08: 94ae31d6fda838a4 R0900007fe8001c9d30
      R10: 00007ffdbd1d34b0 R11: 0000000000000246 R12: 00007ffdbd1d3600
      R13: 0000000000000000 R14: fffffffffffffffc R15: 00007ffdbd1d3560
      </TASK>
      
      Link: https://lkml.kernel.org/r/20220813020509.90805-1-kuniyu@amazon.com
      Fixes: 69d54b91
      
       ("kprobes: makes kprobes/enabled works correctly for optimized kprobes.")
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reported-by: default avatarAyushman Dutta <ayudutta@amazon.com>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
      Cc: Kuniyuki Iwashima <kuni1840@gmail.com>
      Cc: Ayushman Dutta <ayudutta@amazon.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc91d2db
    • Geert Uytterhoeven's avatar
      netfilter: conntrack: NF_CONNTRACK_PROCFS should no longer default to y · f7de9e58
      Geert Uytterhoeven authored
      [ Upstream commit aa5762c3 ]
      
      NF_CONNTRACK_PROCFS was marked obsolete in commit 54b07dca
      
      
      ("netfilter: provide config option to disable ancient procfs parts") in
      v3.3.
      
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f7de9e58
    • Juergen Gross's avatar
      s390/hypfs: avoid error message under KVM · a40d5308
      Juergen Gross authored
      [ Upstream commit 7b6670b0
      
       ]
      
      When booting under KVM the following error messages are issued:
      
      hypfs.7f5705: The hardware system does not support hypfs
      hypfs.7a79f0: Initialization of hypfs failed with rc=-61
      
      Demote the severity of first message from "error" to "info" and issue
      the second message only in other error cases.
      
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Acked-by: default avatarHeiko Carstens <hca@linux.ibm.com>
      Acked-by: default avatarChristian Borntraeger <borntraeger@linux.ibm.com>
      Link: https://lore.kernel.org/r/20220620094534.18967-1-jgross@suse.com
      [arch/s390/hypfs/hypfs_diag.c changed description]
      Signed-off-by: default avatarAlexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a40d5308
    • Denis V. Lunev's avatar
      neigh: fix possible DoS due to net iface start/stop loop · 05fdce1a
      Denis V. Lunev authored
      [ Upstream commit 66ba215c
      
       ]
      
      Normal processing of ARP request (usually this is Ethernet broadcast
      packet) coming to the host is looking like the following:
      * the packet comes to arp_process() call and is passed through routing
        procedure
      * the request is put into the queue using pneigh_enqueue() if
        corresponding ARP record is not local (common case for container
        records on the host)
      * the request is processed by timer (within 80 jiffies by default) and
        ARP reply is sent from the same arp_process() using
        NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED condition (flag is set inside
        pneigh_enqueue())
      
      And here the problem comes. Linux kernel calls pneigh_queue_purge()
      which destroys the whole queue of ARP requests on ANY network interface
      start/stop event through __neigh_ifdown().
      
      This is actually not a problem within the original world as network
      interface start/stop was accessible to the host 'root' only, which
      could do more destructive things. But the world is changed and there
      are Linux containers available. Here container 'root' has an access
      to this API and could be considered as untrusted user in the hosting
      (container's) world.
      
      Thus there is an attack vector to other containers on node when
      container's root will endlessly start/stop interfaces. We have observed
      similar situation on a real production node when docker container was
      doing such activity and thus other containers on the node become not
      accessible.
      
      The patch proposed doing very simple thing. It drops only packets from
      the same namespace in the pneigh_queue_purge() where network interface
      state change is detected. This is enough to prevent the problem for the
      whole node preserving original semantics of the code.
      
      v2:
      	- do del_timer_sync() if queue is empty after pneigh_queue_purge()
      v3:
      	- rebase to net tree
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@kernel.org>
      Cc: Yajun Deng <yajun.deng@linux.dev>
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
      Cc: Konstantin Khorenko <khorenko@virtuozzo.com>
      Cc: kernel@openvz.org
      Cc: devel@openvz.org
      Investigated-by: default avatarAlexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
      Signed-off-by: default avatarDenis V. Lunev <den@openvz.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      05fdce1a
    • Fudong Wang's avatar
      drm/amd/display: clear optc underflow before turn off odm clock · 44368779
      Fudong Wang authored
      [ Upstream commit b2a93490
      
       ]
      
      [Why]
      After ODM clock off, optc underflow bit will be kept there always and clear not work.
      We need to clear that before clock off.
      
      [How]
      Clear that if have when clock off.
      
      Reviewed-by: default avatarAlvin Lee <alvin.lee2@amd.com>
      Acked-by: default avatarTom Chung <chiahsuan.chung@amd.com>
      Signed-off-by: default avatarFudong Wang <Fudong.Wang@amd.com>
      Tested-by: default avatarDaniel Wheeler <daniel.wheeler@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      44368779
    • Jann Horn's avatar
      mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse · 6dbfc25d
      Jann Horn authored
      commit 2555283e upstream.
      
      anon_vma->degree tracks the combined number of child anon_vmas and VMAs
      that use the anon_vma as their ->anon_vma.
      
      anon_vma_clone() then assumes that for any anon_vma attached to
      src->anon_vma_chain other than src->anon_vma, it is impossible for it to
      be a leaf node of the VMA tree, meaning that for such VMAs ->degree is
      elevated by 1 because of a child anon_vma, meaning that if ->degree
      equals 1 there are no VMAs that use the anon_vma as their ->anon_vma.
      
      This assumption is wrong because the ->degree optimization leads to leaf
      nodes being abandoned on anon_vma_clone() - an existing anon_vma is
      reused and no new parent-child relationship is created.  So it is
      possible to reuse an anon_vma for one VMA while it is still tied to
      another VMA.
      
      This is an issue because is_mergeable_anon_vma() and its callers assume
      that if two VMAs have the same ->anon_vma, the list of anon_vmas
      attached to the VMAs is guaranteed to be the same.  When this assumption
      is violated, vma_merge() can merge pages into a VMA that is not attached
      to the corresponding anon_vma, leading to dangling page->mapping
      pointers that will be dereferenced during rmap walks.
      
      Fix it by separately tracking the number of child anon_vmas and the
      number of VMAs using the anon_vma as their ->anon_vma.
      
      Fixes: 7a3ef208
      
       ("mm: prevent endless growth of anon_vma hierarchy")
      Cc: stable@kernel.org
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6dbfc25d
    • Yang Jihong's avatar
      ftrace: Fix NULL pointer dereference in is_ftrace_trampoline when ftrace is dead · ddffe882
      Yang Jihong authored
      commit c3b0f72e
      
       upstream.
      
      ftrace_startup does not remove ops from ftrace_ops_list when
      ftrace_startup_enable fails:
      
      register_ftrace_function
        ftrace_startup
          __register_ftrace_function
            ...
            add_ftrace_ops(&ftrace_ops_list, ops)
            ...
          ...
          ftrace_startup_enable // if ftrace failed to modify, ftrace_disabled is set to 1
          ...
        return 0 // ops is in the ftrace_ops_list.
      
      When ftrace_disabled = 1, unregister_ftrace_function simply returns without doing anything:
      unregister_ftrace_function
        ftrace_shutdown
          if (unlikely(ftrace_disabled))
                  return -ENODEV;  // return here, __unregister_ftrace_function is not executed,
                                   // as a result, ops is still in the ftrace_ops_list
          __unregister_ftrace_function
          ...
      
      If ops is dynamically allocated, it will be free later, in this case,
      is_ftrace_trampoline accesses NULL pointer:
      
      is_ftrace_trampoline
        ftrace_ops_trampoline
          do_for_each_ftrace_op(op, ftrace_ops_list) // OOPS! op may be NULL!
      
      Syzkaller reports as follows:
      [ 1203.506103] BUG: kernel NULL pointer dereference, address: 000000000000010b
      [ 1203.508039] #PF: supervisor read access in kernel mode
      [ 1203.508798] #PF: error_code(0x0000) - not-present page
      [ 1203.509558] PGD 800000011660b067 P4D 800000011660b067 PUD 130fb8067 PMD 0
      [ 1203.510560] Oops: 0000 [#1] SMP KASAN PTI
      [ 1203.511189] CPU: 6 PID: 29532 Comm: syz-executor.2 Tainted: G    B   W         5.10.0 #8
      [ 1203.512324] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
      [ 1203.513895] RIP: 0010:is_ftrace_trampoline+0x26/0xb0
      [ 1203.514644] Code: ff eb d3 90 41 55 41 54 49 89 fc 55 53 e8 f2 00 fd ff 48 8b 1d 3b 35 5d 03 e8 e6 00 fd ff 48 8d bb 90 00 00 00 e8 2a 81 26 00 <48> 8b ab 90 00 00 00 48 85 ed 74 1d e8 c9 00 fd ff 48 8d bb 98 00
      [ 1203.518838] RSP: 0018:ffffc900012cf960 EFLAGS: 00010246
      [ 1203.520092] RAX: 0000000000000000 RBX: 000000000000007b RCX: ffffffff8a331866
      [ 1203.521469] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000000000000010b
      [ 1203.522583] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8df18b07
      [ 1203.523550] R10: fffffbfff1be3160 R11: 0000000000000001 R12: 0000000000478399
      [ 1203.524596] R13: 0000000000000000 R14: ffff888145088000 R15: 0000000000000008
      [ 1203.525634] FS:  00007f429f5f4700(0000) GS:ffff8881daf00000(0000) knlGS:0000000000000000
      [ 1203.526801] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1203.527626] CR2: 000000000000010b CR3: 0000000170e1e001 CR4: 00000000003706e0
      [ 1203.528611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1203.529605] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Therefore, when ftrace_startup_enable fails, we need to rollback registration
      process and remove ops from ftrace_ops_list.
      
      Link: https://lkml.kernel.org/r/20220818032659.56209-1-yangjihong1@huawei.com
      
      Suggested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarYang Jihong <yangjihong1@huawei.com>
      Signed-off-by: default avatarSteven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ddffe882
    • Letu Ren's avatar
      fbdev: fb_pm2fb: Avoid potential divide by zero error · 7f88cdfe
      Letu Ren authored
      commit 19f953e7
      
       upstream.
      
      In `do_fb_ioctl()` of fbmem.c, if cmd is FBIOPUT_VSCREENINFO, var will be
      copied from user, then go through `fb_set_var()` and
      `info->fbops->fb_check_var()` which could may be `pm2fb_check_var()`.
      Along the path, `var->pixclock` won't be modified. This function checks
      whether reciprocal of `var->pixclock` is too high. If `var->pixclock` is
      zero, there will be a divide by zero error. So, it is necessary to check
      whether denominator is zero to avoid crash. As this bug is found by
      Syzkaller, logs are listed below.
      
      divide error in pm2fb_check_var
      Call Trace:
       <TASK>
       fb_set_var+0x367/0xeb0 drivers/video/fbdev/core/fbmem.c:1015
       do_fb_ioctl+0x234/0x670 drivers/video/fbdev/core/fbmem.c:1110
       fb_ioctl+0xdd/0x130 drivers/video/fbdev/core/fbmem.c:1189
      
      Reported-by: default avatarZheyu Ma <zheyuma97@gmail.com>
      Signed-off-by: default avatarLetu Ren <fantasquex@gmail.com>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7f88cdfe