  1. Feb 23, 2022
    • net: sched: limit TC_ACT_REPEAT loops · 132de3a6
      Eric Dumazet authored
      commit 5740d068 upstream.
      
      We have been living dangerously, at the mercy of malicious users
      abusing TC_ACT_REPEAT, as shown by this syzbot report [1].
      
      Add an arbitrary limit (32) to the number of times an action can
      return TC_ACT_REPEAT.
      
      v2: switch the limit to 32 instead of 10.
          Use net_warn_ratelimited() instead of pr_err_once().
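
The bounded repeat described above can be sketched in user space as follows. This is only an illustration, not the kernel change itself: `run_action`, `always_repeat`, and the `ACT_*` names are hypothetical stand-ins, and only the limit of 32 comes from the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's verdict codes; values are illustrative. */
enum verdict { ACT_OK = 0, ACT_REPEAT = 6 };

#define REPEAT_LIMIT 32 /* the limit named in the commit message */

typedef enum verdict (*action_fn)(void *state);

/* Run one action, calling it again while it asks for a repeat,
 * but never more than REPEAT_LIMIT times in total. */
enum verdict run_action(action_fn act, void *state)
{
	int repeat_ttl = REPEAT_LIMIT;
	enum verdict ret;

	assert(act != NULL);
	do {
		ret = act(state);
		if (ret != ACT_REPEAT)
			return ret;
	} while (--repeat_ttl != 0);

	/* Assumption: give up quietly and let the packet continue. */
	fprintf(stderr, "repeat limit reached, stopping the action\n");
	return ACT_OK;
}

/* An action that always asks to be repeated and counts its calls. */
enum verdict always_repeat(void *state)
{
	++*(int *)state;
	return ACT_REPEAT;
}
```

With `always_repeat`, the loop ends after exactly 32 calls instead of spinning forever, which is the property the stall report below shows was missing.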
      
      [1] (C repro available on demand)
      
      rcu: INFO: rcu_preempt self-detected stall on CPU
      rcu:    1-...!: (10500 ticks this GP) idle=021/1/0x4000000000000000 softirq=5592/5592 fqs=0
              (t=10502 jiffies g=5305 q=190)
      rcu: rcu_preempt kthread timer wakeup didn't happen for 10502 jiffies! g5305 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
      rcu:    Possible timer handling issue on cpu=0 timer-softirq=3527
      rcu: rcu_preempt kthread starved for 10505 jiffies! g5305 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
      rcu:    Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
      rcu: RCU grace-period kthread stack dump:
      task:rcu_preempt     state:I stack:29344 pid:   14 ppid:     2 flags:0x00004000
      Call Trace:
       <TASK>
       context_switch kernel/sched/core.c:4986 [inline]
       __schedule+0xab2/0x4db0 kernel/sched/core.c:6295
       schedule+0xd2/0x260 kernel/sched/core.c:6368
       schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881
       rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1963
       rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2136
       kthread+0x2e9/0x3a0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
       </TASK>
      rcu: Stack dump where RCU GP kthread last ran:
      Sending NMI from CPU 1 to CPUs 0:
      NMI backtrace for cpu 0
      CPU: 0 PID: 3646 Comm: syz-executor358 Not tainted 5.17.0-rc3-syzkaller-00149-gbf8e59fd315f #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:rep_nop arch/x86/include/asm/vdso/processor.h:13 [inline]
      RIP: 0010:cpu_relax arch/x86/include/asm/vdso/processor.h:18 [inline]
      RIP: 0010:pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:437 [inline]
      RIP: 0010:__pv_queued_spin_lock_slowpath+0x3b8/0xb40 kernel/locking/qspinlock.c:508
      Code: 48 89 eb c6 45 01 01 41 bc 00 80 00 00 48 c1 e9 03 83 e3 07 41 be 01 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8d 2c 01 eb 0c <f3> 90 41 83 ec 01 0f 84 72 04 00 00 41 0f b6 45 00 38 d8 7f 08 84
      RSP: 0018:ffffc9000283f1b0 EFLAGS: 00000206
      RAX: 0000000000000003 RBX: 0000000000000000 RCX: 1ffff1100fc0071e
      RDX: 0000000000000001 RSI: 0000000000000201 RDI: 0000000000000000
      RBP: ffff88807e0038f0 R08: 0000000000000001 R09: ffffffff8ffbf9ff
      R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000004c1e
      R13: ffffed100fc0071e R14: 0000000000000001 R15: ffff8880b9c3aa80
      FS:  00005555562bf300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffdbfef12b8 CR3: 00000000723c2000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
       queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
       queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
       do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:115
       spin_lock_bh include/linux/spinlock.h:354 [inline]
       sch_tree_lock include/net/sch_generic.h:610 [inline]
       sch_tree_lock include/net/sch_generic.h:605 [inline]
       prio_tune+0x3b9/0xb50 net/sched/sch_prio.c:211
       prio_init+0x5c/0x80 net/sched/sch_prio.c:244
       qdisc_create.constprop.0+0x44a/0x10f0 net/sched/sch_api.c:1253
       tc_modify_qdisc+0x4c5/0x1980 net/sched/sch_api.c:1660
       rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5594
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
       netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
       netlink_unicast+0x539/0x7e0 net/netlink/af_netlink.c:1343
       netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:725
       ____sys_sendmsg+0x6e8/0x810 net/socket.c:2413
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2467
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f7ee98aae99
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007ffdbfef12d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007ffdbfef1300 RCX: 00007f7ee98aae99
      RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d
      R10: 000000000000000d R11: 0000000000000246 R12: 00007ffdbfef12f0
      R13: 00000000000f4240 R14: 000000000004ca47 R15: 00007ffdbfef12e4
       </TASK>
      INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 2.293 msecs
      NMI backtrace for cpu 1
      CPU: 1 PID: 3260 Comm: kworker/1:3 Not tainted 5.17.0-rc3-syzkaller-00149-gbf8e59fd315f #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: mld mld_ifc_work
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111
       nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
       trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
       rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:343
       print_cpu_stall kernel/rcu/tree_stall.h:604 [inline]
       check_cpu_stall kernel/rcu/tree_stall.h:688 [inline]
       rcu_pending kernel/rcu/tree.c:3919 [inline]
       rcu_sched_clock_irq.cold+0x5c/0x759 kernel/rcu/tree.c:2617
       update_process_times+0x16d/0x200 kernel/time/timer.c:1785
       tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226
       tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1428
       __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
       __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749
       hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811
       local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline]
       __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103
       sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097
       </IRQ>
       <TASK>
       asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
      RIP: 0010:__sanitizer_cov_trace_const_cmp4+0xc/0x70 kernel/kcov.c:286
      Code: 00 00 00 48 89 7c 30 e8 48 89 4c 30 f0 4c 89 54 d8 20 48 89 10 5b c3 0f 1f 80 00 00 00 00 41 89 f8 bf 03 00 00 00 4c 8b 14 24 <89> f1 65 48 8b 34 25 00 70 02 00 e8 14 f9 ff ff 84 c0 74 4b 48 8b
      RSP: 0018:ffffc90002c5eea8 EFLAGS: 00000246
      RAX: 0000000000000007 RBX: ffff88801c625800 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: ffff8880137d3100 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff874fcd88 R11: 0000000000000000 R12: ffff88801d692dc0
      R13: ffff8880137d3104 R14: 0000000000000000 R15: ffff88801d692de8
       tcf_police_act+0x358/0x11d0 net/sched/act_police.c:256
       tcf_action_exec net/sched/act_api.c:1049 [inline]
       tcf_action_exec+0x1a6/0x530 net/sched/act_api.c:1026
       tcf_exts_exec include/net/pkt_cls.h:326 [inline]
       route4_classify+0xef0/0x1400 net/sched/cls_route.c:179
       __tcf_classify net/sched/cls_api.c:1549 [inline]
       tcf_classify+0x3e8/0x9d0 net/sched/cls_api.c:1615
       prio_classify net/sched/sch_prio.c:42 [inline]
       prio_enqueue+0x3a7/0x790 net/sched/sch_prio.c:75
       dev_qdisc_enqueue+0x40/0x300 net/core/dev.c:3668
       __dev_xmit_skb net/core/dev.c:3756 [inline]
       __dev_queue_xmit+0x1f61/0x3660 net/core/dev.c:4081
       neigh_hh_output include/net/neighbour.h:533 [inline]
       neigh_output include/net/neighbour.h:547 [inline]
       ip_finish_output2+0x14dc/0x2170 net/ipv4/ip_output.c:228
       __ip_finish_output net/ipv4/ip_output.c:306 [inline]
       __ip_finish_output+0x396/0x650 net/ipv4/ip_output.c:288
       ip_finish_output+0x32/0x200 net/ipv4/ip_output.c:316
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip_output+0x196/0x310 net/ipv4/ip_output.c:430
       dst_output include/net/dst.h:451 [inline]
       ip_local_out+0xaf/0x1a0 net/ipv4/ip_output.c:126
       iptunnel_xmit+0x628/0xa50 net/ipv4/ip_tunnel_core.c:82
       geneve_xmit_skb drivers/net/geneve.c:966 [inline]
       geneve_xmit+0x10c8/0x3530 drivers/net/geneve.c:1077
       __netdev_start_xmit include/linux/netdevice.h:4683 [inline]
       netdev_start_xmit include/linux/netdevice.h:4697 [inline]
       xmit_one net/core/dev.c:3473 [inline]
       dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3489
       __dev_queue_xmit+0x2985/0x3660 net/core/dev.c:4116
       neigh_hh_output include/net/neighbour.h:533 [inline]
       neigh_output include/net/neighbour.h:547 [inline]
       ip6_finish_output2+0xf7a/0x14f0 net/ipv6/ip6_output.c:126
       __ip6_finish_output net/ipv6/ip6_output.c:191 [inline]
       __ip6_finish_output+0x61e/0xe90 net/ipv6/ip6_output.c:170
       ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201
       NF_HOOK_COND include/linux/netfilter.h:296 [inline]
       ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224
       dst_output include/net/dst.h:451 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       NF_HOOK include/linux/netfilter.h:301 [inline]
       mld_sendpack+0x9a3/0xe40 net/ipv6/mcast.c:1826
       mld_send_cr net/ipv6/mcast.c:2127 [inline]
       mld_ifc_work+0x71c/0xdc0 net/ipv6/mcast.c:2659
       process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307
       worker_thread+0x657/0x1110 kernel/workqueue.c:2454
       kthread+0x2e9/0x3a0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
       </TASK>
      ----------------
      Code disassembly (best guess):
         0:   48 89 eb                mov    %rbp,%rbx
         3:   c6 45 01 01             movb   $0x1,0x1(%rbp)
         7:   41 bc 00 80 00 00       mov    $0x8000,%r12d
         d:   48 c1 e9 03             shr    $0x3,%rcx
        11:   83 e3 07                and    $0x7,%ebx
        14:   41 be 01 00 00 00       mov    $0x1,%r14d
        1a:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
        21:   fc ff df
        24:   4c 8d 2c 01             lea    (%rcx,%rax,1),%r13
        28:   eb 0c                   jmp    0x36
      * 2a:   f3 90                   pause <-- trapping instruction
        2c:   41 83 ec 01             sub    $0x1,%r12d
        30:   0f 84 72 04 00 00       je     0x4a8
        36:   41 0f b6 45 00          movzbl 0x0(%r13),%eax
        3b:   38 d8                   cmp    %bl,%al
        3d:   7f 08                   jg     0x47
        3f:   84                      .byte 0x84
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20220215235305.3272331-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ucounts: Move RLIMIT_NPROC handling after set_user · 5f68f27d
      Eric W. Biederman authored
      commit c923a8e7 upstream.
      
      During set*id(), which cred->ucounts to charge the current process
      to is not known until after set_cred_ucounts.  So move the
      RLIMIT_NPROC checking into a new helper flag_nproc_exceeded and call
      flag_nproc_exceeded after set_cred_ucounts.
      
      This is very much an arbitrary subset of the places where we currently
      change the RLIMIT_NPROC accounting, designed to preserve the existing
      logic.
      
      Fixing the existing logic will be the subject of another series of
      changes.
      
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20220216155832.680775-4-ebiederm@xmission.com
      Fixes: 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user · 6f6e8ccb
      Eric W. Biederman authored
      commit c16bdeb5 upstream.
      
      Solar Designer <solar@openwall.com> wrote:
      > I'm not aware of anyone actually running into this issue and reporting
      > it.  The systems that I personally know use suexec along with rlimits
      > still run older/distro kernels, so would not yet be affected.
      >
      > So my mention was based on my understanding of how suexec works, and
      > code review.  Specifically, Apache httpd has the setting RLimitNPROC,
      > which makes it set RLIMIT_NPROC:
      >
      > https://httpd.apache.org/docs/2.4/mod/core.html#rlimitnproc
      >
      > The above documentation for it includes:
      >
      > "This applies to processes forked from Apache httpd children servicing
      > requests, not the Apache httpd children themselves. This includes CGI
      > scripts and SSI exec commands, but not any processes forked from the
      > Apache httpd parent, such as piped logs."
      >
      > In code, there are:
      >
      > ./modules/generators/mod_cgid.c:        ( (cgid_req.limits.limit_nproc_set) && ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
      > ./modules/generators/mod_cgi.c:        ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
      > ./modules/filters/mod_ext_filter.c:    rv = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, conf->limit_nproc);
      >
      > For example, in mod_cgi.c this is in run_cgi_child().
      >
      > I think this means an httpd child sets RLIMIT_NPROC shortly before it
      > execs suexec, which is a SUID root program.  suexec then switches to the
      > target user and execs the CGI script.
      >
      > Before 2863643f, the setuid() in suexec would set the flag, and the
      > target user's process count would be checked against RLIMIT_NPROC on
      > execve().  After 2863643f, the setuid() in suexec wouldn't set the
      > flag because setuid() is (naturally) called when the process is still
      > running as root (thus, has those limits bypass capabilities), and
      > accordingly execve() would not check the target user's process count
      > against RLIMIT_NPROC.
      
      In commit 2863643f ("set_user: add capability check when
      rlimit(RLIMIT_NPROC) exceeds") capable calls were added to set_user to
      make it more consistent with fork.  Unfortunately because of call site
      differences those capable calls were checking the credentials of the
      user before set*id() instead of after set*id().
      
      This breaks enforcement of RLIMIT_NPROC for applications that set the
      rlimit and then call set*id() while holding a full set of
      capabilities.  The capabilities are only changed in the new credential
      in security_task_fix_setuid().
      
      The code in apache suexec appears to follow this pattern.
      
      Commit 909cc4ae86f3 ("[PATCH] Fix two bugs with process limits
      (RLIMIT_NPROC)"), where this check was added, describes the targets of
      this capability check as:
      
        2/ When a root-owned process (e.g. cgiwrap) sets up process limits and then
            calls setuid, the setuid should fail if the user would then be running
            more than rlim_cur[RLIMIT_NPROC] processes, but it doesn't.  This patch
            adds an appropriate test.  With this patch, and per-user process limit
            imposed in cgiwrap really works.
      
      So the original use case of this check also appears to match the broken
      pattern.
      
      Restore the enforcement of RLIMIT_NPROC by removing the bad capable
      checks added in set_user.  This unfortunately restores the
      inconsistent state the code has been in for the last 11 years, but
      dealing with the inconsistencies looks like a larger problem.
      
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/all/20210907213042.GA22626@openwall.com/
      Link: https://lkml.kernel.org/r/20220212221412.GA29214@openwall.com
      Link: https://lkml.kernel.org/r/20220216155832.680775-1-ebiederm@xmission.com
      Fixes: 2863643f ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
      History-Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Reviewed-by: Solar Designer <solar@openwall.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1 · 4ac77eb2
      Eric W. Biederman authored
      commit 8f2f9c4d upstream.
      
      Michal Koutný <mkoutny@suse.com> wrote:
      
      > It was reported that v5.14 behaves differently when enforcing
      > RLIMIT_NPROC limit, namely, it allows one more task than previously.
      > This is consequence of the commit 21d1c5e3 ("Reimplement
      > RLIMIT_NPROC on top of ucounts") that missed the sharpness of
      > equality in the forking path.
      
      This can be fixed either by fixing the test or by moving the increment
      to be before the test.  Fix it by moving copy_creds which contains
      the increment before is_ucounts_overlimit.
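
The off-by-one can be shown with a small user-space model. It is only a sketch: `struct user_count` and the `fork_*` helpers are hypothetical and stand in for the real accounting, with "over limit" modelled as a strict greater-than comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-user counter; not the kernel's struct ucounts. */
struct user_count { long nproc; };

bool over_limit(const struct user_count *uc, long limit)
{
	return uc->nproc > limit;
}

/* Buggy order: test first, then count the new task. Admits limit + 1 tasks. */
bool fork_check_then_inc(struct user_count *uc, long limit)
{
	if (over_limit(uc, limit))
		return false;
	uc->nproc++;
	return true;
}

/* Fixed order: count the new task first, then test, undoing the count on failure. */
bool fork_inc_then_check(struct user_count *uc, long limit)
{
	assert(limit >= 0);
	uc->nproc++;
	if (over_limit(uc, limit)) {
		uc->nproc--;
		return false;
	}
	return true;
}

/* Count how many forks succeed starting from an empty counter. */
long admitted(bool (*try_fork)(struct user_count *, long), long limit)
{
	struct user_count uc = { 0 };

	while (try_fork(&uc, limit))
		;
	return uc.nproc;
}
```

With a limit of 3, the check-then-increment order admits 4 tasks while the increment-then-check order admits 3.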
      
      In the case of CLONE_NEWUSER the ucounts in the task_cred changes.
      The function is_ucounts_overlimit needs to use the final version of
      the ucounts for the new process.  Which means moving the
      is_ucounts_overlimit test after copy_creds is necessary.
      
      Both the test in fork and the test in set_user were semantically
      changed when the code moved to ucounts.  The change of the test in
      fork was bad because it was before the increment.  The test in
      set_user was wrong and the change to ucounts fixed it.  So this
      fix only restores the old behavior in one location not two.
      
      Link: https://lkml.kernel.org/r/20220204181144.24462-1-mkoutny@suse.com
      Link: https://lkml.kernel.org/r/20220216155832.680775-2-ebiederm@xmission.com
      Cc: stable@vger.kernel.org
      Reported-by: Michal Koutný <mkoutny@suse.com>
      Reviewed-by: Michal Koutný <mkoutny@suse.com>
      Fixes: 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ucounts: Handle wrapping in is_ucounts_overlimit · ce6c0486
      Eric W. Biederman authored
      commit 0cbae9e2 upstream.
      
      While examining is_ucounts_overlimit and reading the various messages
      I realized that is_ucounts_overlimit fails to deal with counts that
      may have wrapped.
      
      Being wrapped should be a transitory state for counts and they should
      never be wrapped for long, but it can happen so handle it.
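
One way to handle a wrapped count is sketched below. It assumes the count is read as a signed value and that a negative reading means it wrapped; the commit message does not spell out the mechanism, and `count_overlimit` is a hypothetical name.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Sketch: `count` is a signed snapshot of a counter, `rlimit` an unsigned
 * limit. A negative snapshot is taken to mean the counter wrapped, and is
 * reported as over the limit rather than compared as if it were small.
 */
bool count_overlimit(long count, unsigned long rlimit)
{
	/* Clamp the limit so the signed comparison below is meaningful. */
	long max = rlimit > (unsigned long)LONG_MAX ? LONG_MAX : (long)rlimit;

	assert(max >= 0);
	return count < 0 || count > max;
}
```

Without the `count < 0` test, a wrapped count would compare as below any limit and the check would pass.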
      
      Cc: stable@vger.kernel.org
      Fixes: 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      Link: https://lkml.kernel.org/r/20220216155832.680775-5-ebiederm@xmission.com
      Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ucounts: Base set_cred_ucounts changes on the real user · 0c6c4d6d
      Eric W. Biederman authored
      commit a55d0729 upstream.
      
      Michal Koutný <mkoutny@suse.com> wrote:
      > Tasks are associated to multiple users at once. Historically and as per
      > setrlimit(2) RLIMIT_NPROC is enforced based on real user ID.
      >
      > The commit 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      > made the accounting structure "indexed" by euid and hence potentially
      > account tasks differently.
      >
      > The effective user ID may be different e.g. for setuid programs but
      > those are exec'd into already existing task (i.e. below limit), so
      > different accounting is moot.
      >
      > Some special setresuid(2) users may notice the difference, justifying
      > this fix.
      
      I looked at cred->ucount and it is only used for rlimit operations
      that were previously stored in cred->user.  That makes the fact that
      cred->ucount can refer to a different user from cred->user a bug,
      affecting all uses of cred->ulimit, not just RLIMIT_NPROC.
      
      Fix set_cred_ucounts to always use the real uid not the effective uid.
      
      Further simplify set_cred_ucounts by noticing that set_cred_ucounts
      somehow retained a draft version of the check to see if alloc_ucounts
      was needed that checks the new->user and new->user_ns against the
      current_real_cred().  Remove that draft version of the check.
      
      All that matters for setting the cred->ucounts are the user_ns and uid
      fields in the cred.
      
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20220207121800.5079-4-mkoutny@suse.com
      Link: https://lkml.kernel.org/r/20220216155832.680775-3-ebiederm@xmission.com
      Reported-by: Michal Koutný <mkoutny@suse.com>
      Reviewed-by: Michal Koutný <mkoutny@suse.com>
      Fixes: 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/ptrace: Fix xfpregs_set()'s incorrect xmm clearing · 99dbf353
      Andy Lutomirski authored
      commit 44cad52c upstream.
      
      xfpregs_set() handles 32-bit REGSET_XFP and 64-bit REGSET_FP. The actual
      code treats these regsets as modern FX state (i.e. the beginning part of
      XSTATE). The declarations of the regsets thought they were the legacy
      i387 format. The code thought they were the 32-bit (no xmm8..15) variant
      of XSTATE and, for good measure, made the high bits disappear by zeroing
      the wrong part of the buffer. The latter broke ptrace, and everything
      else confused anyone trying to understand the code. In particular, the
      nonsense definitions of the regsets confused me when I wrote this code.
      
      Clean this all up. Change the declarations to match reality (which
      shouldn't change the generated code, let alone the ABI) and fix
      xfpregs_set() to clear the correct bits and to only do so for 32-bit
      callers.
      
      Fixes: 6164331d ("x86/fpu: Rewrite xfpregs_set()")
      Reported-by: Luís Ferreira <contact@lsferreira.net>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215524
      Link: https://lore.kernel.org/r/YgpFnZpF01WwR8wU@zn.tnic
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • EDAC: Fix calculation of returned address and next offset in edac_align_ptr() · 624c164c
      Eliav Farber authored
      commit f8efca92 upstream.
      
      Do alignment logic properly and use the "ptr" local variable for
      calculating the remainder of the alignment.
      
      This became an issue because struct edac_mc_layer has a size that is not
      zero modulo eight, and the next offset that was prepared for the private
      data was unaligned, causing an alignment exception.
      
      The patch in Fixes: which broke this stated that "what we actually
      care about is the alignment of the actual pointer that's about to be
      returned." But it didn't check that alignment.
      
      Use the correct variable "ptr" for that.
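
The alignment arithmetic can be sketched as below. This is a simplified user-space model working on plain offsets, not the kernel function; `reserve_aligned` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reserve `size` bytes at the running offset `*next`, aligned to `align`.
 * Returns the aligned offset and advances `*next` past the reservation.
 * The remainder is computed from the value being returned (`ptr`), not
 * from the address of the bookkeeping variable.
 */
uintptr_t reserve_aligned(uintptr_t *next, uintptr_t size, uintptr_t align)
{
	uintptr_t ptr = *next;
	uintptr_t r;

	assert(align != 0);
	*next += size;

	r = ptr % align;
	if (r == 0)
		return ptr;

	/* Keep the following offset consistent with the padding just added. */
	*next += align - r;
	return ptr + (align - r);
}
```

Starting from offset 20 with an 8-byte alignment, a 16-byte reservation is placed at 24 and the next free offset becomes 40, so the following reservation stays aligned.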
      
        [ bp: Massage commit message. ]
      
      Fixes: 8447c4d1 ("edac: Do alignment logic properly in edac_align_ptr()")
      Signed-off-by: Eliav Farber <farbere@amazon.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20220113100622.12783-2-farbere@amazon.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: lpfc: Fix pt2pt NVMe PRLI reject LOGO loop · dc426f86
      James Smart authored
      commit 7f4c5a26 upstream.
      
      When connected point to point, the driver does not know the FC4s supported
      by the other end. In Fabrics, it can query the nameserver.  Thus the driver
      must send PRLIs for the FC4s it supports and enable support based on the
      acc(ept) or rej(ect) of the respective FC4 PRLI.  Currently the driver
      supports SCSI and NVMe PRLIs.
      
      Unfortunately, although the behavior is per standard, many devices have
      come to expect only SCSI PRLIs. In this particular example, the NVMe PRLI
      is properly RJT'd but the target decided that it must LOGO after seeing the
      unexpected NVMe PRLI. The LOGO causes the sequence to restart and login is
      now in an infinite failure loop.
      
      Fix the problem by having the driver, on a pt2pt link, remember NVMe PRLI
      accept or reject status across logout as long as the link stays "up".  When
      retrying login, if the prior NVMe PRLI was rejected, it will not be sent on
      the next login.
      
      Link: https://lore.kernel.org/r/20220212163120.15385-1-jsmart2021@gmail.com
      Cc: <stable@vger.kernel.org> # v5.4+
      Reviewed-by: Ewan D. Milne <emilne@redhat.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: brcmnand: Fixed incorrect sub-page ECC status · 813ec08e
      david regan authored
      commit 36415a79 upstream.
      
      The brcmnand driver contains a bug in which if a page (example 2k byte)
      is read from the parallel/ONFI NAND and within that page a subpage (512
      byte) has correctable errors which is followed by a subpage with
      uncorrectable errors, the page read will return the wrong status of
      correctable (as opposed to the actual status of uncorrectable.)
      
      The bug is in function brcmnand_read_by_pio where there is a check for
      uncorrectable bits which will be preempted if a previous status for
      correctable bits is detected.
      
      The fix is to stop checking for bad bits only if we already have a bad
      bits status.
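
The intended priority between the two statuses can be sketched as follows. The enum and `page_status` are hypothetical; the driver itself reads controller registers rather than an array of per-subpage results.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-subpage results, ordered by severity. */
enum ecc_status { ECC_OK = 0, ECC_CORRECTED = 1, ECC_UNCORRECTABLE = 2 };

/*
 * Combine subpage results into one page status. Only an uncorrectable
 * result ends the scan; a corrected subpage must not hide a later
 * uncorrectable one.
 */
enum ecc_status page_status(const enum ecc_status *sub, size_t n)
{
	enum ecc_status page = ECC_OK;
	size_t i;

	assert(sub != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (sub[i] == ECC_UNCORRECTABLE)
			return ECC_UNCORRECTABLE;
		if (sub[i] == ECC_CORRECTED)
			page = ECC_CORRECTED;
	}
	return page;
}
```

A page whose second subpage was corrected and whose third is uncorrectable reports uncorrectable, which is the case the bug got wrong.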
      
      Fixes: 27c5b17c ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
      Signed-off-by: david regan <dregan@mail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/trinity-478e0c09-9134-40e8-8f8c-31c371225eda-1643237024774@3c-app-mailcom-lxa02
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: phram: Prevent divide by zero bug in phram_setup() · 0d9cbbf9
      Dan Carpenter authored
      commit 3e376587 upstream.
      
      The problem is that "erasesize" is a uint64_t type so it might be
      non-zero but the lower 32 bits are zero so when it's truncated,
      "(uint32_t)erasesize", then that value is zero. This leads to a
      divide by zero bug.
      
      Avoid the bug by delaying the divide until after we have validated
      that "erasesize" is non-zero and within the uint32_t range.
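
The truncation hazard and the validate-then-divide order can be sketched in plain C. `len_is_multiple_of_erasesize` is a hypothetical helper, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Check that `len` is a whole number of erase blocks. `erasesize` arrives
 * as a 64-bit value but the divisor is 32 bits wide, so reject zero and
 * anything above UINT32_MAX before dividing.
 */
bool len_is_multiple_of_erasesize(uint64_t len, uint64_t erasesize)
{
	uint32_t divisor;

	if (erasesize == 0 || erasesize > UINT32_MAX)
		return false;

	divisor = (uint32_t)erasesize;
	assert(divisor != 0); /* guaranteed by the range check above */
	return len % divisor == 0;
}
```

An erasesize of 0x100000000 is non-zero as a 64-bit value but truncates to 0; here it is rejected by the range check before any division takes place.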
      
      Fixes: dc2b3e5c ("mtd: phram: use div_u64_rem to stop overwrite len in phram_setup")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20220121115505.GI1978@kadam
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: parsers: qcom: Fix missing free for pparts in cleanup · 1b37889f
      Christian Marangi authored
      commit 3dd8ba96 upstream.
      
      Mtdpart doesn't free pparts when a cleanup function is declared.
      Add missing free for pparts in cleanup function for smem to fix the
      leak.
      
      Fixes: 10f3b4d7 ("mtd: parsers: qcom: Fix leaking of partition name")
      Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20220116032211.9728-2-ansuelsmth@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: parsers: qcom: Fix kernel panic on skipped partition · a2995fe2
      Christian Marangi authored
      commit 65d003cc upstream.
      
      In the event of a skipped partition (case when the entry name is empty)
      the kernel panics in the cleanup function as the name entry is NULL.
      Rework the parser logic by first checking the real partition number and
      then allocating the space and setting the data for the valid partitions.
      
      The logic was also fundamentally wrong: with a skipped partition, the
      number of parts returned was incorrect because it was not decreased for
      the skipped partitions.
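
The count-first, allocate-second approach can be sketched as below. `struct raw_entry`, `struct part`, `parse_parts`, and `free_parts` are hypothetical user-space stand-ins for the parser's data; an empty name marks a skipped entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry; an empty name marks a partition to skip. */
struct raw_entry { const char *name; };
struct part { char *name; };

/* Release every name and then the array itself. */
void free_parts(struct part *parts, int n)
{
	int i;

	for (i = 0; i < n; i++)
		free(parts[i].name);
	free(parts);
}

/*
 * First pass counts the valid entries, second pass allocates exactly that
 * many and fills them. Returns the number of valid partitions, or -1 on
 * allocation failure; on success *out holds the array (NULL when empty).
 */
int parse_parts(const struct raw_entry *raw, int n, struct part **out)
{
	struct part *parts;
	int valid = 0, i, j = 0;

	assert(out != NULL);
	*out = NULL;

	for (i = 0; i < n; i++)
		if (raw[i].name && raw[i].name[0] != '\0')
			valid++;
	if (valid == 0)
		return 0;

	parts = calloc((size_t)valid, sizeof(*parts));
	if (!parts)
		return -1;

	for (i = 0; i < n; i++) {
		size_t len;

		if (!raw[i].name || raw[i].name[0] == '\0')
			continue;
		len = strlen(raw[i].name) + 1;
		parts[j].name = malloc(len);
		if (!parts[j].name) {
			free_parts(parts, j);
			return -1;
		}
		memcpy(parts[j].name, raw[i].name, len);
		j++;
	}
	*out = parts;
	return valid;
}
```

Because the array holds only valid entries, the cleanup never sees a NULL name, and the returned count matches the number of entries actually filled in.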
      
      Fixes: 803eb124 ("mtd: parsers: Add Qcom SMEM parser")
      Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20220116032211.9728-1-ansuelsmth@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a2995fe2
    • Bryan O'Donoghue's avatar
      mtd: rawnand: qcom: Fix clock sequencing in qcom_nandc_probe() · c2ca95fd
      Bryan O'Donoghue authored
      commit 5c23b3f9 upstream.
      
      Interacting with a NAND chip on an IPQ6018 I found that the qcomsmem NAND
      partition parser was returning -EPROBE_DEFER waiting for the main smem
      driver to load.
      
      This caused the board to reset. Playing about with the probe() function
      shows that the problem lies in the core clock being switched off before the
      nandc_unalloc() routine has completed.
      
      If we look at how qcom_nandc_remove() tears down allocated resources we see
      the expected order is
      
      qcom_nandc_unalloc(nandc);
      
      clk_disable_unprepare(nandc->aon_clk);
      clk_disable_unprepare(nandc->core_clk);
      
      dma_unmap_resource(&pdev->dev, nandc->base_dma, resource_size(res),
      		   DMA_BIDIRECTIONAL, 0);
      
      Tweaking probe() to both bring up and tear-down in that order removes the
      reset if we end up deferring elsewhere.
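
      The ordering argument above can be illustrated with a small user-space model of goto-based unwinding. This is a sketch only: `model_probe()`, `note()` and the step names are hypothetical, and the bring-up order (map, clocks, alloc) is assumed to be the reverse of the remove() sequence quoted above.

```c
#include <assert.h>
#include <string.h>

static char trace[64];	/* records the order of bring-up and tear-down steps */

static void note(const char *step)
{
	assert(strlen(trace) + strlen(step) < sizeof(trace));
	strcat(trace, step);
}

/*
 * Hypothetical probe model. fail_step selects where bring-up fails:
 * 1 = before mapping, 2 = before clocks, 3 = before alloc,
 * 4 = after everything is up (for example, a later step defers).
 * 0 means success.
 */
static int model_probe(int fail_step)
{
	trace[0] = '\0';

	if (fail_step == 1)
		goto out;
	note("map ");
	if (fail_step == 2)
		goto err_unmap;
	note("clk ");
	if (fail_step == 3)
		goto err_unclk;
	note("alloc ");
	if (fail_step == 4)
		goto err_unalloc;
	return 0;

err_unalloc:
	note("unalloc ");	/* runs while the clocks are still enabled */
err_unclk:
	note("unclk ");
err_unmap:
	note("unmap ");
out:
	return -1;
}
```

      Each error label undoes exactly the steps that had completed, in reverse order, so the unalloc step never runs with its clock already switched off.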
      
      Fixes: c76b78d8 ("mtd: nand: Qualcomm NAND controller driver")
      Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
      Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20220103030316.58301-2-bryan.odonoghue@linaro.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block: fix surprise removal for drivers calling blk_set_queue_dying · 0da8318e
      Christoph Hellwig authored
      commit 7a5428dc upstream.
      
      Various block drivers call blk_set_queue_dying to mark a disk as dead due
      to surprise removal events, but since commit 8e141f9e that doesn't
      work given that the GD_DEAD flag needs to be set to stop I/O.
      
      Replace the driver calls to blk_set_queue_dying with a new (and properly
      documented) blk_mark_disk_dead API, and fold blk_set_queue_dying into the
      only remaining caller.
      
      Fixes: 8e141f9e ("block: drain file system I/O on del_gendisk")
      Reported-by: Markus Blöchl <markus.bloechl@ipetronik.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Link: https://lore.kernel.org/r/20220217075231.1140-1-hch@lst.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tty: n_tty: do not look ahead for EOL character past the end of the buffer · ee421a75
      Linus Torvalds authored
      commit 35930307 upstream.
      
      Daniel Gibson reports that the n_tty code gets line termination wrong in
      very specific cases:
      
       "If you feed a line with exactly 64 chars + terminating newline, and
        directly afterwards (without reading) another line into a pseudo
        terminal, the first read() on the other side will return the 64
        char line *without* terminating newline, and the next read() will
        return the missing terminating newline AND the complete next line (if
        it fits in the buffer)"
      
      and bisected the behavior to commit 3b830a9c ("tty: convert
      tty_ldisc_ops 'read()' function to take a kernel pointer").
      
      Now, digging deeper, it turns out that the behavior isn't exactly new:
      what changed in commit 3b830a9c was that the tty line discipline
      .read() function is now passed an intermediate kernel buffer rather than
      the final user space buffer.
      
      And that intermediate kernel buffer is 64 bytes in size - thus that
      special case with exactly 64 bytes plus terminating newline.
      
      The same problem did exist before, but historically the boundary was not
      the 64-byte chunk, but the user-supplied buffer size, which is obviously
      generally bigger (and potentially bigger than N_TTY_BUF_SIZE, which
      would hide the issue entirely).
      
      The reason is that the n_tty canon_copy_from_read_buf() code would look
      ahead for the EOL character one byte further than it would actually
      copy.  It would then decide that it had found the terminator, and unmark
      it as an EOL character - which in turn explains why the next read
      wouldn't then be terminated by it.
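
      The look-ahead behaviour described above can be reproduced with a small user-space model. This is a simplified sketch, not the n_tty code: `line_buf`, `push_line()` and `canon_copy()` are hypothetical, the buffer is linear rather than circular, and EOF handling is left out entirely. The `lookahead` argument is 1 for the old behaviour and 0 for the behaviour after the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 256

struct line_buf {
	char data[BUF_SIZE];
	bool eol[BUF_SIZE];	/* eol[i] is set when data[i] terminates a line */
	size_t tail;		/* next unread byte */
	size_t head;		/* one past the last buffered byte */
};

/* Append a line and mark its terminating newline as an EOL character. */
static void push_line(struct line_buf *lb, const char *text)
{
	size_t len = strlen(text);

	assert(lb->head + len + 1 <= BUF_SIZE);
	memcpy(lb->data + lb->head, text, len);
	lb->data[lb->head + len] = '\n';
	lb->eol[lb->head + len] = true;
	lb->head += len + 1;
}

/*
 * Copy at most 'room' bytes of the current line into 'dst' and return the
 * number of bytes copied. With lookahead == 1 the search for the EOL mark
 * extends one byte past what can be copied, and the mark is cleared even
 * though the terminator itself was not copied.
 */
static size_t canon_copy(struct line_buf *lb, char *dst, size_t room,
			 size_t lookahead)
{
	size_t avail, span, i, n, c;
	bool found = false;

	assert(lb->tail <= lb->head && lb->head <= BUF_SIZE);
	avail = lb->head - lb->tail;
	span = room + lookahead;
	if (span > avail)
		span = avail;

	for (i = 0; i < span; i++) {
		if (lb->eol[lb->tail + i]) {
			found = true;
			break;
		}
	}
	n = i;				/* bytes before the terminator */
	c = n + (found ? 1 : 0);
	if (c > room)
		c = room;		/* terminator does not fit */

	memcpy(dst, lb->data + lb->tail, c);
	if (found)
		lb->eol[lb->tail + n] = false;
	lb->tail += c;
	return c;
}
```

      With a 4-byte destination and the lines "abcd" and "xy" queued, the look-ahead variant returns "abcd" and then "\nxy\n" in a single read, which mirrors the report; without look-ahead the second read returns only the newline.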
      
      Now, the reason it did all this in the first place is related to some
      historical and pretty obscure EOF behavior, see commit ac8f3bf8
      ("n_tty: Fix poll() after buffer-limited eof push read") and commit
      40d5e090 ("n_tty: Fix EOF push handling").
      
      And the reason for the EOL confusion is that we treat EOF as a special
      EOL condition, with the EOL character being NUL (aka "__DISABLED_CHAR"
      in the kernel sources).
      
      So that EOF look-ahead also affects the normal EOL handling.
      
      This patch just removes the look-ahead that causes problems, because EOL
      is much more critical than the historical "EOF in the middle of a line
      that coincides with the end of the buffer" handling ever was.
      
      Now, it is possible that we should indeed re-introduce the "look at next
      character to see if it's an EOF" behavior, but if so, that should be done
      not at the kernel buffer chunk boundary in canon_copy_from_read_buf(),
      but at a higher level, when we run out of the user buffer.
      
      In particular, the place to do that would be at the top of
      'n_tty_read()', where we check if it's a continuation of a previously
      started read, and there is no more buffer space left, we could decide to
      just eat the __DISABLED_CHAR at that point.
      
      But that would be a separate patch, because I suspect nobody actually
      cares, and I'd like to get a report about it before bothering.
      
      Fixes: 3b830a9c ("tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer")
      Fixes: ac8f3bf8 ("n_tty: Fix poll() after buffer-limited eof push read")
      Fixes: 40d5e090 ("n_tty: Fix EOF push handling")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215611
      Reported-and-tested-by: Daniel Gibson <metalcaedes@gmail.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jirislaby@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • NFS: Do not report writeback errors in nfs_getattr() · 855e6139
      Trond Myklebust authored
      commit d19e0183 upstream.
      
      The result of the writeback, whether it is an ENOSPC or an EIO, or
      anything else, does not inhibit the NFS client from reporting the
      correct file timestamps.
      
      Fixes: 79566ef0 ("NFS: Getattr doesn't require data sync semantics")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • NFS: LOOKUP_DIRECTORY is also ok with symlinks · aab7d08f
      Trond Myklebust authored
      commit e0caaf75 upstream.
      
      Commit ac795161 (NFSv4: Handle case where the lookup of a directory
      fails) [1], part of Linux since 5.17-rc2, introduced a regression, where
      a symbolic link on an NFS mount to a directory on another NFS does not
      resolve(?) the first time it is accessed:
      
      Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Fixes: ac795161 ("NFSv4: Handle case where the lookup of a directory fails")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Tested-by: Donald Buczek <buczek@molgen.mpg.de>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • NFS: Remove an incorrect revalidation in nfs4_update_changeattr_locked() · 36143d0d
      Trond Myklebust authored
      commit 9d047bf6 upstream.
      
      In nfs4_update_changeattr_locked(), we don't need to set the
      NFS_INO_REVAL_PAGECACHE flag, because we already know the value of the
      change attribute, and we're already flagging the size. In fact, this
      forces us to revalidate the change attribute a second time for no good
      reason.
      This extra flag appears to have been introduced as part of the xattr
      feature, when update_changeattr_locked() was converted for use by the
      xattr code.
      
      Fixes: 1b523ca9 ("nfs: modify update_changeattr to deal with regular files")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block/wbt: fix negative inflight counter when remove scsi device · 268b7ce2
      Laibin Qiu authored
      commit e92bc4cd upstream.
      
      We now disable wbt by setting WBT_STATE_OFF_DEFAULT in
      wbt_disable_default() when switching the elevator to bfq. When a
      SCSI device is removed, wbt is re-enabled by wbt_enable_default().
      If wbt thereby becomes enabled between wbt_wait() and wbt_track()
      while a write request is being submitted, the inflight accounting
      goes wrong.
      
      The following is the scenario that triggered the problem.
      
      T1                          T2                           T3
                                  elevator_switch_mq
                                  bfq_init_queue
                                  wbt_disable_default <= Set
                                  rwb->enable_state (OFF)
      submit_bio
      blk_mq_make_request
      rq_qos_throttle
      <= rwb->enable_state (OFF)
                                                               scsi_remove_device
                                                               sd_remove
                                                               del_gendisk
                                                               blk_unregister_queue
                                                               elv_unregister_queue
                                                               wbt_enable_default
                                                               <= Set rwb->enable_state (ON)
      rq_qos_track
      <= rwb->enable_state (ON)
      ^^^^^^ this request is marked WBT_TRACKED without having been added to
      inflight, so wbt_done() drops rqw->inflight to -1, which triggers an IO hang.
      
      Fix this by moving wbt_enable_default() from elv_unregister_queue() to
      bfq_exit_queue(), so that wbt is only re-enabled when bfq exits.
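
      The interleaving in the scenario above can be replayed with a minimal single-threaded model. This is a sketch under stated assumptions: `rwb_model`, `req_model` and the `model_*()` helpers are hypothetical stand-ins for the accounting done around wbt_wait(), wbt_track() and wbt_done(), not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the writeback throttling state. */
struct rwb_model {
	bool enabled;
	int inflight;
};

/* Hypothetical model of a request. */
struct req_model {
	bool tracked;
};

/* Wait step: the request is counted only while throttling is enabled. */
static void model_wait(struct rwb_model *rwb)
{
	assert(rwb);
	if (rwb->enabled)
		rwb->inflight++;
}

/* Track step: the request is flagged only while throttling is enabled. */
static void model_track(const struct rwb_model *rwb, struct req_model *rq)
{
	assert(rwb && rq);
	rq->tracked = rwb->enabled;
}

/* Completion step: a flagged request gives back its inflight count. */
static void model_done(struct rwb_model *rwb, struct req_model *rq)
{
	assert(rwb && rq);
	if (rq->tracked) {
		rwb->inflight--;
		rq->tracked = false;
	}
}

/*
 * Replay one request. If reenable_between is true, throttling is switched
 * on between the wait and track steps, as in the T3 column above.
 * Returns the final inflight counter.
 */
static int replay(bool reenable_between)
{
	struct rwb_model rwb = { .enabled = false, .inflight = 0 };
	struct req_model rq = { .tracked = false };

	model_wait(&rwb);
	if (reenable_between)
		rwb.enabled = true;
	model_track(&rwb, &rq);
	model_done(&rwb, &rq);
	return rwb.inflight;
}
```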
      
      Fixes: 76a80408 ("blk-wbt: make sure throttle is enabled properly")
      
      Remove oneline stale comment, and kill one oneshot local variable.
      
      Signed-off-by: Ming Lei <ming.lei@rehdat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/linux-block/20211214133103.551813-1-qiulaibin@huawei.com/
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: qcom: Actually clear DMA interrupt register for HDMI · 880982db
      Stephen Boyd authored
      commit c8d251f5 upstream.
      
      In commit da0363f7 ("ASoC: qcom: Fix for DMA interrupt clear reg
      overwriting") we changed regmap_write() to regmap_update_bits() so that
      we can avoid overwriting bits that we didn't intend to modify.
      Unfortunately this change breaks the case where a register is writable
      but not readable, which is exactly how the HDMI irq clear register is
      designed (grep around LPASS_HDMITX_APP_IRQCLEAR_REG to see how it's
      write only). That's because regmap_update_bits() tries to read the
      register from the hardware and if it isn't readable it looks in the
      regmap cache to see what was written there last time to compare against
      what we want to write there. Eventually, we're unable to modify this
      register at all because the bits that we're trying to set are already
      set in the cache.
      
      This is doubly bad for the irq clear register because you have to write
      the bit to clear an interrupt. Given the irq is level triggered, we see
      an interrupt storm upon plugging in an HDMI cable and starting audio
      playback. The irq storm is so great that performance degrades
      significantly, leading to CPU soft lockups.
      
      Fix it by using regmap_write_bits() so that we really do write the bits
      in the clear register that we want to. This brings the number of irqs
      handled by lpass_dma_interrupt_handler() down from ~150k/sec to ~10/sec.
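
      The difference between the two update styles can be shown with a toy model of a cached, write-only register. This is a sketch only: `wo_reg` and `reg_update()` are hypothetical and merely imitate the "skip the write when the cached value already matches" behaviour described above; the `force` flag stands in for the always-write variant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a cached register that cannot be read back. */
struct wo_reg {
	uint32_t cache;		/* last value written, used instead of a read */
	unsigned int hw_writes;	/* writes that actually reached the hardware */
};

/*
 * Read-modify-write against the cache. Without 'force', the write is
 * skipped when the masked bits already match the cached value, so a
 * write-one-to-clear bit is never written a second time.
 */
static void reg_update(struct wo_reg *reg, uint32_t mask, uint32_t val,
		       bool force)
{
	uint32_t old, next;

	assert(reg);
	old = reg->cache;
	next = (old & ~mask) | (val & mask);
	if (!force && next == old)
		return;
	reg->cache = next;
	reg->hw_writes++;
}
```

      In this model the second non-forced update of the same bit is dropped, which corresponds to an interrupt that is never cleared again; the forced update always reaches the hardware.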
      
      Fixes: da0363f7 ("ASoC: qcom: Fix for DMA interrupt clear reg overwriting")
      Cc: Srinivasa Rao Mandadapu <srivasam@codeaurora.org>
      Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Link: https://lore.kernel.org/r/20220209232520.4017634-1-swboyd@chromium.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: tas2770: Insert post reset delay · ce38a920
      Martin Povišer authored
      commit 307f3145 upstream.
      
      Per TAS2770 datasheet there must be a 1 ms delay from reset to first
      command. So insert delays into the driver where appropriate.
      
      Fixes: 1a476abc ("tas2770: add tas2770 smart PA kernel driver")
      Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
      Link: https://lore.kernel.org/r/20220204095301.5554-1-povik+lin@cutebit.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: ufs: Fix a deadlock in the error handler · d69d98d8
      Bart Van Assche authored
      commit 945c3cca upstream.
      
      The following deadlock has been observed on a test setup:
      
       - All tags allocated
      
       - The SCSI error handler calls ufshcd_eh_host_reset_handler()
      
       - ufshcd_eh_host_reset_handler() queues work that calls
         ufshcd_err_handler()
      
       - ufshcd_err_handler() locks up as follows:
      
      Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt
      Call trace:
       __switch_to+0x298/0x5d8
       __schedule+0x6cc/0xa94
       schedule+0x12c/0x298
       blk_mq_get_tag+0x210/0x480
       __blk_mq_alloc_request+0x1c8/0x284
       blk_get_request+0x74/0x134
       ufshcd_exec_dev_cmd+0x68/0x640
       ufshcd_verify_dev_init+0x68/0x35c
       ufshcd_probe_hba+0x12c/0x1cb8
       ufshcd_host_reset_and_restore+0x88/0x254
       ufshcd_reset_and_restore+0xd0/0x354
       ufshcd_err_handler+0x408/0xc58
       process_one_work+0x24c/0x66c
       worker_thread+0x3e8/0xa4c
       kthread+0x150/0x1b4
       ret_from_fork+0x10/0x30
      
      Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved
      request.
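
      Why a reserved request avoids the lockup can be sketched with a toy tag pool. This is an illustration under stated assumptions: `tag_pool` and `tag_alloc()` are hypothetical, the pool sizes are arbitrary, and returning -1 stands in for the point where a real caller would sleep waiting for a free tag.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TAGS      4	/* total tags in this toy pool */
#define NR_RESERVED  1	/* tags set aside for internal commands */

struct tag_pool {
	bool busy[NR_TAGS];
};

/*
 * Regular requests draw from tags [NR_RESERVED, NR_TAGS); reserved
 * requests draw from [0, NR_RESERVED). Returns the tag number, or -1
 * when the selected range is exhausted.
 */
static int tag_alloc(struct tag_pool *pool, bool reserved)
{
	int lo = reserved ? 0 : NR_RESERVED;
	int hi = reserved ? NR_RESERVED : NR_TAGS;
	int i;

	assert(pool);
	for (i = lo; i < hi; i++) {
		if (!pool->busy[i]) {
			pool->busy[i] = true;
			return i;
		}
	}
	return -1;
}
```

      Once every regular tag is taken, a regular allocation can make no progress, while a reserved allocation still succeeds; that is the property the error handler relies on in this sketch.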
      
      Link: https://lore.kernel.org/r/20211203231950.193369-10-bvanassche@acm.org
      Tested-by: Bean Huo <beanhuo@micron.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Bean Huo <beanhuo@micron.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: ufs: Remove dead code · 84fdbb03
      Bart Van Assche authored
      commit d77ea822 upstream.
      
      Commit 7252a360 ("scsi: ufs: Avoid busy-waiting by eliminating tag
      conflicts") guarantees that 'tag' is not in use by any SCSI command.
      Remove the check that returns early if a conflict occurs.
      
      Link: https://lore.kernel.org/r/20211203231950.193369-6-bvanassche@acm.org
      Tested-by: Bean Huo <beanhuo@micron.com>
      Reviewed-by: Bean Huo <beanhuo@micron.com>
      Acked-by: Avri Altman <avri.altman@wdc.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tipc: fix wrong notification node addresses · 934c8c95
      Jon Maloy authored
      commit c08e5843 upstream.
      
      The previous bug fix had an unfortunate side effect that broke
      distribution of binding table entries between nodes. The updated
      tipc_sock_addr struct is also used further down in the same
      function, and there the old value is still the correct one.
      
      Fixes: 032062f3 ("tipc: fix wrong publisher node address in link publications")
      Signed-off-by: Jon Maloy <jmaloy@redhat.com>
      Link: https://lore.kernel.org/r/20220216020009.3404578-1-jmaloy@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • smb3: fix snapshot mount option · e124fe29
      Steve French authored
      commit 9405b5f8 upstream.
      
      The conversion to the new mount API broke the snapshot mount option
      due to a 32- vs. 64-bit type mismatch.
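
      The effect of such a width mismatch can be shown with a short user-space sketch. It assumes, for illustration only, that the option value is a large 64-bit number; `parse_snapshot()` and its `use_u64` switch are hypothetical and are not the cifs parsing code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical option parser. With use_u64 == 0 the parsed value passes
 * through a 32-bit field, which silently drops the upper 32 bits.
 */
static uint64_t parse_snapshot(const char *text, int use_u64)
{
	unsigned long long value;

	assert(text);
	value = strtoull(text, NULL, 10);
	if (use_u64)
		return (uint64_t)value;
	return (uint32_t)value;	/* truncated to the low 32 bits */
}
```

      Any value above 4294967295 comes back changed from the 32-bit path, so the option would refer to a different snapshot than the one requested.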
      
      Fixes: 24e0a1ef ("cifs: switch to new mount api")
      Cc: stable@vger.kernel.org # 5.11+
      Reported-by: <ruckajan10@gmail.com>
      Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: gpmi: don't leak PM reference in error path · 58d3111e
      Christian Eggers authored
      commit 9161f365 upstream.
      
      If gpmi_nfc_apply_timings() fails, the PM runtime usage counter must be
      dropped.
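
      The pattern behind this kind of fix can be sketched in user-space C. This is a hedged illustration: `dev_model`, `pm_get()`, `pm_put()`, `apply_timings_model()` and `run_op_model()` are hypothetical stand-ins for a usage counter and a step that may fail, not the driver's functions.

```c
#include <assert.h>

/* Hypothetical device with a runtime usage counter. */
struct dev_model {
	int usage;
};

static void pm_get(struct dev_model *dev)
{
	dev->usage++;
}

static void pm_put(struct dev_model *dev)
{
	dev->usage--;
}

/* Hypothetical stand-in for a timing setup step that may fail. */
static int apply_timings_model(int should_fail)
{
	return should_fail ? -5 : 0;
}

static int run_op_model(struct dev_model *dev, int should_fail)
{
	int ret;

	assert(dev);
	pm_get(dev);

	ret = apply_timings_model(should_fail);
	if (ret)
		goto out_pm;	/* an early "return ret" here would leak the reference */

	/* ... the actual operation would run here ... */

out_pm:
	pm_put(dev);
	return ret;
}
```

      Routing the failure through the common exit label keeps the get and put calls balanced on both the success and the error path.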
      
      Reported-by: Pavel Machek <pavel@denx.de>
      Fixes: f53d4c10 ("mtd: rawnand: gpmi: Add ERR007117 protection for nfc_apply_timings")
      Signed-off-by: Christian Eggers <ceggers@arri.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20220125081619.6286-1-ceggers@arri.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/lib/sstep: fix 'ptesync' build error · 2b61859f
      Anders Roxell authored
      commit fe663df7 upstream.
      
      Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian
      2.37.90.20220207) the following build error shows up:
      
        {standard input}: Assembler messages:
        {standard input}:2088: Error: unrecognized opcode: `ptesync'
        make[3]: *** [/builds/linux/scripts/Makefile.build:287: arch/powerpc/lib/sstep.o] Error 1
      
      Add the 'ifdef CONFIG_PPC64' around the 'ptesync' in function
      'emulate_update_regs()', as is already done in 'analyse_instr()'. It looks
      like it got dropped inadvertently by commit 3cdfcbfd ("powerpc: Change
      analyse_instr so it doesn't modify *regs").
      
      A key detail is that analyse_instr() will never recognise lwsync or
      ptesync on 32-bit (because of the existing ifdef), and as a result
      emulate_update_regs() should never be called with an op specifying
      either of those on 32-bit. So removing them from emulate_update_regs()
      should be a nop in terms of runtime behaviour.
      
      Fixes: 3cdfcbfd ("powerpc: Change analyse_instr so it doesn't modify *regs")
      Cc: stable@vger.kernel.org # v4.14+
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      [mpe: Add last paragraph of change log mentioning analyse_instr() details]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20220211005113.1361436-1-anders.roxell@linaro.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE · 9273f93c
      Christophe Leroy authored
      commit 9bb162fa upstream.
      
      Although kernel text is always mapped with BATs, we still have
      inittext mapped with pages, so TLB miss handling is required
      when CONFIG_DEBUG_PAGEALLOC or CONFIG_KFENCE is set.
      
      The final solution should be to set a BAT that also maps inittext
      but that BAT then needs to be cleared at end of init, and it will
      require more changes to be able to do it properly.
      
      As DEBUG_PAGEALLOC and KFENCE are debugging options, performance is not a
      big deal, so let's fix it simply for now to enable easy stable application.
      
      Fixes: 035b19a1 ("powerpc/32s: Always map kernel text and rodata with BATs")
      Cc: stable@vger.kernel.org # v5.11+
      Reported-by: Maxime Bizon <mbizon@freebox.fr>
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/aea33b4813a26bdb9378b5f273f00bd5d4abe240.1638857364.git.christophe.leroy@csgroup.eu
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40 · 225e7bc4
      Woody Suwalski authored
      commit bfe55a1f upstream.
      
      Add an ACPI idle power level limit for 32-bit ThinkPad T40.
      
      There is a regression on T40 introduced by commit d6b88ce2, starting
      with kernel 5.16:
      
      commit d6b88ce2
      Author: Richard Gong <richard.gong@amd.com>
      Date:   Wed Sep 22 08:31:16 2021 -0500
      
        ACPI: processor idle: Allow playing dead in C3 state
      
      The above patch tries to enter the C3 state during init, which causes
      a T40 system freeze. I have not found a similar issue on any other of my
      32-bit machines.
      
      The fix is to add another exception to the processor_power_dmi_table[] list.
      As a result the dmesg shows as expected:
      
      [2.155398] ACPI: IBM ThinkPad T40 detected - limiting to C2 max_cstate. Override with "processor.max_cstate=9"
      [2.155404] ACPI: processor limited to max C-state 2
      
      The fix is trivial and affects only vintage T40 systems.
      
      Fixes: d6b88ce2 ("ACPI: processor idle: Allow playing dead in C3 state")
      Signed-off-by: Woody Suwalski <wsuwalski@gmail.com>
      Reviewed-by: Hans de Goede <hdegoede@redhat.com>
      Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
      [ rjw: New subject ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cifs: fix confusing unneeded warning message on smb2.1 and earlier · cd387fb7
      Steve French authored
      commit 53923e0f upstream.
      
      When mounting with SMB2.1 or earlier, even with nomultichannel, we
      log the confusing warning message:
        "CIFS: VFS: multichannel is not supported on this protocol version, use 3.0 or above"
      
      Fix this so that we don't log this unless they really are trying
      to mount with multichannel.
      
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215608
      Reported-by: Kim Scarborough <kim@scarborough.kim>
      Cc: stable@vger.kernel.org # 5.11+
      Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cifs: fix set of group SID via NTSD xattrs · 1a195b01
      Amir Goldstein authored
      commit dd5a927e upstream.
      
      'setcifsacl -g <SID>' silently fails to set the group SID on server.
      
      Actually, the bug existed since commit 438471b6 ("CIFS: Add support
      for setting owner info, dos attributes, and create time"), but this fix
      will not apply cleanly to kernel versions <= v5.10.
      
      Fixes: 3970acf7 ("SMB3: Add support for getting and setting SACLs")
      Cc: stable@vger.kernel.org # 5.11+
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: ops: Fix stereo change notifications in snd_soc_put_xr_sx() · a06d52d2
      Mark Brown authored
      commit 2b7c4636 upstream.
      
      When writing out a stereo control we discard the change notification from
      the first channel, meaning that events are only generated based on changes
      to the second channel. Ensure that we report a change if either channel
      has changed.
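
      The "report a change if either channel changed" rule can be sketched as follows. This is a user-space illustration only: `update_reg()` and `put_stereo()` are hypothetical helpers that follow the usual convention of returning 1 when a value changed, 0 when it did not, and a negative number on error.

```c
#include <assert.h>

/* Hypothetical register update: 1 if the stored value changed, else 0. */
static int update_reg(unsigned int *reg, unsigned int val)
{
	assert(reg);
	if (*reg == val)
		return 0;
	*reg = val;
	return 1;
}

/*
 * Write both channels of a stereo control. The result of the first
 * channel is kept instead of being overwritten by the second one, so a
 * change on either channel is reported to the caller.
 */
static int put_stereo(unsigned int regs[2], unsigned int left,
		      unsigned int right)
{
	int err, ret;

	err = update_reg(&regs[0], left);
	if (err < 0)
		return err;
	ret = err;		/* remember the first channel's result */

	err = update_reg(&regs[1], right);
	if (err < 0)
		return err;
	if (err > 0)
		ret = err;	/* second channel changed as well */

	return ret;
}
```

      Returning only the second call's result would report no change when just the left channel was modified.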
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20220201155629.120510-5-broonie@kernel.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_sx() · e8ee1a1b
      Mark Brown authored
      commit 7f3d90a3 upstream.
      
      When writing out a stereo control we discard the change notification from
      the first channel, meaning that events are only generated based on changes
      to the second channel. Ensure that we report a change if either channel
      has changed.
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20220201155629.120510-3-broonie@kernel.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw_range() · cfeaa7ba
      Mark Brown authored
      commit 650204de upstream.
      
      When writing out a stereo control we discard the change notification from
      the first channel, meaning that events are only generated based on changes
      to the second channel. Ensure that we report a change if either channel
      has changed.
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20220201155629.120510-4-broonie@kernel.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: ops: Fix stereo change notifications in snd_soc_put_volsw() · d163624f
      Mark Brown authored
      commit 564778d7 upstream.
      
      When writing out a stereo control we discard the change notification from
      the first channel, meaning that events are only generated based on changes
      to the second channel. Ensure that we report a change if either channel
      has changed.
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20220201155629.120510-2-broonie@kernel.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: hda: Fix missing codec probe on Shenker Dock 15 · 958fad1d
      Takashi Iwai authored
      commit dd8e5b16 upstream.
      
      For some unknown reason, BIOS on Shenker Dock 15 doesn't set up the
      codec mask properly for the onboard audio.  Let's set the forced codec
      mask to enable the codec discovery.
      
      Reported-by: <dmummenschanz@web.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/trinity-f018660b-95c9-442b-a2a8-c92a56eb07ed-1644345967148@3c-app-webde-bap22
      Link: https://lore.kernel.org/r/20220214100020.8870-2-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: hda: Fix regression on forced probe mask option · d5d78e3d
      Takashi Iwai authored
      commit 6317f744 upstream.
      
      The forced probe mask via probe_mask 0x100 bit doesn't work any longer
      as expected since the bus init code was moved and it's clearing the
      codec_mask value that was set beforehand.  This patch fixes the
      long-time regression by moving the check_probe_mask() call.
      
      Fixes: a41d1224 ("ALSA: hda - Embed bus into controller object")
      Reported-by: <dmummenschanz@web.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/trinity-f018660b-95c9-442b-a2a8-c92a56eb07ed-1644345967148@3c-app-webde-bap22
      Link: https://lore.kernel.org/r/20220214100020.8870-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: hda/realtek: Fix deadlock by COEF mutex · bda6c8f7
      Takashi Iwai authored
      commit 2a845837 upstream.
      
      The recently introduced coef_mutex for Realtek codec seems to cause a
      deadlock when the relevant code is invoked from the power-off state;
      then the HD-audio core tries to power-up internally, and this kicks
      off the codec runtime PM code that tries to take the same coef_mutex.
      
      In order to avoid the deadlock, do the temporary power up/down around
      the coef_mutex acquisition and release.  This assures that the
      power-up sequence runs before the mutex, hence no re-entrance will
      happen.
      
      Fixes: b837a9f5 ("ALSA: hda: realtek: Fix race at concurrent COEF updates")
      Reported-and-tested-by: Julian Wollrath <jwollrath@web.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20220214132838.4db10fca@schienar
      Link: https://lore.kernel.org/r/20220214130410.21230-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: hda/realtek: Add quirk for Legion Y9000X 2019 · 861e505b
      Yu Huang authored
      commit c07f2c7b upstream.
      
      Legion Y9000X 2019 has the same speaker as Y9000X 2020,
      but with a different quirk address. Add one quirk entry
      to make the speaker work on Y9000X 2019 too.
      
      Signed-off-by: Yu Huang <diwang90@gmail.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20220212160835.165065-1-diwang90@gmail.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>