  1. Jan 05, 2022
    • sctp: use call_rcu to free endpoint · 769d14ab
      Xin Long authored
      [ Upstream commit 5ec7d18d ]
      
      This patch is to delay the endpoint free by calling call_rcu() to fix
      another use-after-free issue in sctp_sock_dump():
      
        BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20
        Call Trace:
          __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218
          lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
          __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
          _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
          spin_lock_bh include/linux/spinlock.h:334 [inline]
          __lock_sock+0x203/0x350 net/core/sock.c:2253
          lock_sock_nested+0xfe/0x120 net/core/sock.c:2774
          lock_sock include/net/sock.h:1492 [inline]
          sctp_sock_dump+0x122/0xb20 net/sctp/diag.c:324
          sctp_for_each_transport+0x2b5/0x370 net/sctp/socket.c:5091
          sctp_diag_dump+0x3ac/0x660 net/sctp/diag.c:527
          __inet_diag_dump+0xa8/0x140 net/ipv4/inet_diag.c:1049
          inet_diag_dump+0x9b/0x110 net/ipv4/inet_diag.c:1065
          netlink_dump+0x606/0x1080 net/netlink/af_netlink.c:2244
          __netlink_dump_start+0x59a/0x7c0 net/netlink/af_netlink.c:2352
          netlink_dump_start include/linux/netlink.h:216 [inline]
          inet_diag_handler_cmd+0x2ce/0x3f0 net/ipv4/inet_diag.c:1170
          __sock_diag_cmd net/core/sock_diag.c:232 [inline]
          sock_diag_rcv_msg+0x31d/0x410 net/core/sock_diag.c:263
          netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2477
          sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:274
      
      This issue occurs when asoc is peeled off and the old sk is freed after
      getting it by asoc->base.sk and before calling lock_sock(sk).
      
      To prevent the sk free, as a holder of the sk, ep should be alive when
      calling lock_sock(). This patch uses call_rcu() and moves sock_put and
      ep free into sctp_endpoint_destroy_rcu(), so that it's safe to try to
      hold the ep under rcu_read_lock in sctp_transport_traverse_process().
      
      If sctp_endpoint_hold() returns true, it means this ep is still alive
      and we have held it and can continue to dump it; if it returns false,
      it means this ep is dead and can be freed after rcu_read_unlock, and
      we should skip it.
      
      In sctp_sock_dump(), after locking the sk, if this ep is different from
      tsp->asoc->ep, it means during this dumping, this asoc was peeled off
      before calling lock_sock(), and the sk should be skipped; if this ep is
      the same as tsp->asoc->ep, it means no peeloff happened on this asoc,
      and due to lock_sock, no peeloff will happen either until release_sock.
      
      Note that delaying endpoint free won't delay the port release, as the
      port release happens in sctp_endpoint_destroy() before calling call_rcu().
      Also, freeing endpoint by call_rcu() makes it safe to access the sk by
      asoc->base.sk in sctp_assocs_seq_show() and sctp_rcv().
      
      Thanks to Jones for bringing this issue up.
      
      v1->v2:
        - improve the changelog.
        - add kfree(ep) into sctp_endpoint_destroy_rcu(), as Jakub noticed.
      
      Reported-by: <syzbot+9276d76e83e3bcde6c99@syzkaller.appspotmail.com>
      Reported-by: Lee Jones <lee.jones@linaro.org>
      Fixes: d25adbeb ("sctp: fix an use-after-free issue in sctp_sock_dump")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
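The hold-or-skip rule in the sctp changelog above can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: `struct ep`, `ep_hold()` and `ep_put()` are hypothetical stand-ins, not the kernel's endpoint type or refcount helpers, and the RCU-deferred free itself is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted object. */
struct ep {
    atomic_int refcnt;
};

/*
 * Take a reference only if the object is still alive (count > 0).
 * Returns false when the count has already dropped to zero, in which
 * case the caller must skip the object instead of using it.
 */
static bool ep_hold(struct ep *ep)
{
    int old = atomic_load(&ep->refcnt);

    while (old != 0) {
        /* On failure, old is refreshed with the current count. */
        if (atomic_compare_exchange_weak(&ep->refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when this was the last one. */
static bool ep_put(struct ep *ep)
{
    int prev = atomic_fetch_sub(&ep->refcnt, 1);

    assert(prev > 0);
    return prev == 1;
}
```

A walker that gets false from `ep_hold()` moves on to the next entry. The memory behind the object only stays valid for that failed attempt if the free is deferred past the reader's critical section, which is the part call_rcu() provides in the kernel.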
    • selftests: Calculate udpgso segment count without header adjustment · 13c1bf43
      Coco Li authored
      [ Upstream commit 5471d522 ]
      
      The below referenced commit correctly updated the computation of number
      of segments (gso_size) by using only the gso payload size and
      removing the header lengths.
      
      With this change the regression test started failing. Update
      the tests to match this new behavior.
      
      Both IPv4 and IPv6 tests are updated, as a separate patch in this series
      will update udp_v6_send_skb to match this change in udp_send_skb.
      
      Fixes: 158390e4 ("udp: using datalen to cap max gso segments")
      Signed-off-by: Coco Li <lixiaoyan@google.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20211223222441.2975883-2-lixiaoyan@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • udp: using datalen to cap ipv6 udp max gso segments · abe74fb4
      Coco Li authored
      [ Upstream commit 736ef37f ]
      
      The max number of UDP gso segments is intended to be capped at
      UDP_MAX_SEGMENTS; this is checked in udp_send_skb().
      
      skb->len contains the network and transport header lengths here; we
      should use only the data length instead.
      
      This is the ipv6 counterpart to the below referenced commit,
      which missed the ipv6 change.
      
      Fixes: 158390e4 ("udp: using datalen to cap max gso segments")
      Signed-off-by: Coco Li <lixiaoyan@google.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20211223222441.2975883-1-lixiaoyan@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
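The difference between capping on the data length and capping on a length that still includes headers can be shown with a small user-space sketch. It is not the kernel code: `MAX_SEGMENTS` (assumed to be 64 here) and `UDP_HDR_LEN` are illustrative constants, both function names are hypothetical, and only the UDP header is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration; check the headers of the tree in use. */
#define MAX_SEGMENTS 64 /* stands in for UDP_MAX_SEGMENTS */
#define UDP_HDR_LEN  8  /* size of a UDP header in bytes */

/*
 * Cap check on the payload only: subtract the header before comparing
 * against gso_size * MAX_SEGMENTS.
 */
static bool gso_within_cap(size_t len_with_hdr, size_t gso_size)
{
    assert(len_with_hdr >= UDP_HDR_LEN && gso_size > 0);

    size_t datalen = len_with_hdr - UDP_HDR_LEN;

    return datalen <= gso_size * MAX_SEGMENTS;
}

/* Same check, but header bytes are counted as if they were payload. */
static bool gso_within_cap_with_headers(size_t len_with_hdr, size_t gso_size)
{
    return len_with_hdr <= gso_size * MAX_SEGMENTS;
}
```

With a gso_size of 1000 and exactly 64000 payload bytes, the first check accepts the send while the second rejects it, because the 8 header bytes push the compared length over the limit.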
    • net/mlx5e: Fix ICOSQ recovery flow for XSK · 5e6ad649
      Maxim Mikityanskiy authored
      [ Upstream commit 19c4aba2 ]
      
      There are two ICOSQs per channel: one is needed for RX, and the other
      for async operations (XSK TX, kTLS offload). Currently, the recovery
      flow for both is the same, and async ICOSQ is mistakenly treated like
      the regular ICOSQ.
      
      This patch prevents running the regular ICOSQ recovery on async ICOSQ.
      The purpose of async ICOSQ is to handle XSK wakeup requests and post
      kTLS offload RX parameters; it has nothing to do with RQ and XSKRQ UMRs,
      so the regular recovery sequence is not applicable here.
      
      Fixes: be5323c8 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Aya Levin <ayal@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/mlx5e: Wrap the tx reporter dump callback to extract the sq · 73665165
      Amir Tzin authored
      [ Upstream commit 918fc385 ]
      
      Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
      mlx5e_txqsq *, but in TX-timeout-recovery flow the argument is actually
      of type struct mlx5e_tx_timeout_ctx *.
      
       mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
       mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
       BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
       kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
       CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
       RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180 [mlx5_core]
       Call Trace:
       mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
       devlink_health_do_dump.part.91+0x71/0xd0
       devlink_health_report+0x157/0x1b0
       mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
       ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0 [mlx5_core]
       ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
       ? update_load_avg+0x19b/0x550
       ? set_next_entity+0x72/0x80
       ? pick_next_task_fair+0x227/0x340
       ? finish_task_switch+0xa2/0x280
         mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
         process_one_work+0x1de/0x3a0
         worker_thread+0x2d/0x3c0
       ? process_one_work+0x3a0/0x3a0
         kthread+0x115/0x130
       ? kthread_park+0x90/0x90
         ret_from_fork+0x1f/0x30
       --[ end trace 51ccabea504edaff ]---
       RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
       PKRU: 55555554
       Kernel panic - not syncing: Fatal exception
       Kernel Offset: disabled
       end Kernel panic - not syncing: Fatal exception
      
      To fix this bug, add a wrapper for mlx5e_tx_reporter_dump_sq() which
      extracts the sq from struct mlx5e_tx_timeout_ctx, and set it as the
      TX-timeout-recovery flow dump callback.
      
      Fixes: 5f29458b ("net/mlx5e: Support dump callback in TX reporter")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Signed-off-by: Amir Tzin <amirtz@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
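The wrapper approach from the mlx5e changelog above can be sketched generically: a callback that takes `void *` trusts its caller about the pointed-to type, so a caller holding a different context type needs an adapter that unwraps it first. All names below (`struct txq`, `struct timeout_ctx`, `dump_sq`, `timeout_dump_sq`, `report`) are hypothetical and do not reflect the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical send queue and timeout context. */
struct txq {
    int sqn;
};

struct timeout_ctx {
    struct txq *sq;
    int status;
};

typedef int (*dump_fn)(void *ctx);

/* Expects ctx to point at a struct txq; any other type is a bug. */
static int dump_sq(void *ctx)
{
    struct txq *sq = ctx;

    return sq->sqn;
}

/* Adapter for callers that hold a struct timeout_ctx: unwrap, then delegate. */
static int timeout_dump_sq(void *ctx)
{
    struct timeout_ctx *to_ctx = ctx;

    assert(to_ctx->sq != NULL);
    return dump_sq(to_ctx->sq);
}

/* Generic reporter that only sees an opaque context pointer. */
static int report(dump_fn dump, void *ctx)
{
    return dump(ctx);
}
```

Passing a `struct timeout_ctx *` straight to `dump_sq()` would compile without warning, since `void *` hides the mismatch. Each flow therefore registers the callback that matches the context type it passes.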
    • net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources · 4cd1da02
      Miaoqian Lin authored
      [ Upstream commit 6b8b4258 ]
      
      The mlx5_get_uars_page() function returns error pointers, not NULL.
      Use IS_ERR() to check the return value.
      
      Fixes: 4ec9e7b0 ("net/mlx5: DR, Expose steering domain functionality")
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
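Why a NULL check misses an error pointer can be seen in a user-space imitation of the error-pointer convention. The helpers below (`err_ptr`, `is_err`, `ptr_err`, `get_page`) are simplified, hypothetical re-creations for illustration, not the kernel's own macros, and `MAX_ERRNO` is assumed to be 4095.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095 /* assumed size of the reserved error range */

/* Encode a negative errno value in the top addresses of the pointer range. */
static void *err_ptr(long err)
{
    assert(err < 0 && err >= -MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* True when p lies in the reserved error range. */
static bool is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Hypothetical allocator that reports failure as an error pointer, never NULL. */
static void *get_page(bool fail)
{
    static int page;

    return fail ? err_ptr(-ENOMEM) : (void *)&page;
}
```

A failed `get_page()` returns a non-NULL value, so `if (!p)` treats the failure as success. Only the `is_err()` test catches it.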
    • scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write() · fcb32eb3
      Dan Carpenter authored
      [ Upstream commit 9020be11 ]
      
      The "mybuf" string comes from the user, so we need to ensure that it is NUL
      terminated.
      
      Link: https://lore.kernel.org/r/20211214070527.GA27934@kili
      Fixes: bd2cdd5e ("scsi: lpfc: NVME Initiator: Add debugfs support")
      Reviewed-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
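The general pattern behind the lpfc fix above, copying bytes of unknown origin into a fixed buffer and terminating them before any string function runs, can be sketched in plain C. `copy_terminated()` is a hypothetical helper for illustration; the driver itself copies from user space with kernel APIs that are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_size - 1 bytes from a source that may lack a
 * terminator, then NUL-terminate the destination.
 * Returns the number of bytes copied, excluding the terminator.
 */
static size_t copy_terminated(char *dst, size_t dst_size,
                              const char *src, size_t src_len)
{
    assert(dst_size > 0);

    size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Reserving one byte for the terminator means later calls such as strlen() or a number parser always stop inside the buffer, whatever the writer supplied.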
    • selinux: initialize proto variable in selinux_ip_postroute_compat() · 4833ad49
      Tom Rix authored
      commit 732bc2ff upstream.
      
      Clang static analysis reports this warning
      
      hooks.c:5765:6: warning: 4th function call argument is an uninitialized
                      value
              if (selinux_xfrm_postroute_last(sksec->sid, skb, &ad, proto))
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      selinux_parse_skb() can return ok without setting proto.  The later call
      to selinux_xfrm_postroute_last() does an early check of proto and can
      return ok if the garbage proto value matches.  So initialize proto.
      
      Cc: stable@vger.kernel.org
      Fixes: eef9b416 ("selinux: cleanup selinux_xfrm_sock_rcv_skb() and selinux_xfrm_postroute_last()")
      Signed-off-by: Tom Rix <trix@redhat.com>
      [PM: typo/spelling and checkpatch.pl description fixes]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • recordmcount.pl: fix typo in s390 mcount regex · ec941a22
      Heiko Carstens authored
      commit 4eb1782e upstream.
      
      Commit 85bf17b2 ("recordmcount.pl: look for jgnop instruction as well
      as bcrl on s390") added a new alternative mnemonic for the existing brcl
      instruction. This is required for the combination of old gcc versions
      (pre 9.0) and binutils since version 2.37.
      However, at the same time this commit introduced a typo, replacing brcl with
      bcrl. As a result no mcount locations are detected anymore with old gcc
      versions (pre 9.0) and binutils before version 2.37.
      Fix this by using the correct mnemonic again.
      
      Reported-by: Miroslav Benes <mbenes@suse.cz>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: <stable@vger.kernel.org>
      Fixes: 85bf17b2 ("recordmcount.pl: look for jgnop instruction as well as bcrl on s390")
      Link: https://lore.kernel.org/r/alpine.LSU.2.21.2112230949520.19849@pobox.suse.cz
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • memblock: fix memblock_phys_alloc() section mismatch error · a0e82d5e
      Jackie Liu authored
      [ Upstream commit d7f55471 ]
      
      Fix modpost Section mismatch error in memblock_phys_alloc()
      
      [...]
      WARNING: modpost: vmlinux.o(.text.unlikely+0x1dcc): Section mismatch in reference
      from the function memblock_phys_alloc() to the function .init.text:memblock_phys_alloc_range()
      The function memblock_phys_alloc() references
      the function __init memblock_phys_alloc_range().
      This is often because memblock_phys_alloc lacks a __init
      annotation or the annotation of memblock_phys_alloc_range is wrong.
      
      ERROR: modpost: Section mismatches detected.
      Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them.
      [...]
      
      memblock_phys_alloc() is a one-line wrapper, make it __always_inline to
      avoid these section mismatches.
      
      Reported-by: k2ci <kernel-bot@kylinos.cn>
      Suggested-by: Mike Rapoport <rppt@kernel.org>
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      [rppt: slightly massaged changelog ]
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Link: https://lore.kernel.org/r/20211217020754.2874872-1-liu.yun@linux.dev
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • platform/x86: apple-gmux: use resource_size() with res · 7da855e9
      Wang Qing authored
      [ Upstream commit eb66fb03 ]
      
      The size here should actually be (res->end - res->start + 1);
      use resource_size() directly.
      
      Signed-off-by: Wang Qing <wangqing@vivo.com>
      Link: https://lore.kernel.org/r/1639484316-75873-1-git-send-email-wangqing@vivo.com
      Reviewed-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
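The off-by-one that the apple-gmux change addresses comes from treating an inclusive range as half-open. A minimal sketch, assuming a hypothetical `struct res` whose `end` is the last valid address, mirrors the (end - start + 1) formula quoted in the changelog:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical resource descriptor; end is inclusive. */
struct res {
    uint64_t start;
    uint64_t end;
};

/* Number of addresses covered by an inclusive [start, end] range. */
static uint64_t res_size(const struct res *r)
{
    assert(r->end >= r->start);
    return r->end - r->start + 1;
}
```

A range from 0x700 to 0x7ff therefore spans 0x100 addresses, whereas a plain end - start would report 0xff.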
    • parisc: Clear stale IIR value on instruction access rights trap · d01e9ce1
      Helge Deller authored
      [ Upstream commit 484730e5 ]
      
      When a trap 7 (Instruction access rights) occurs, this means the CPU
      couldn't execute an instruction due to missing execute permissions on
      the memory region.  In this case it seems the CPU didn't even fetch
      the instruction from memory and thus did not store it in the cr19 (IIR)
      register before calling the trap handler. So, the trap handler will find
      some random old stale value in cr19.
      
      This patch simply overwrites the stale IIR value with a constant magic
      "bad food" value (0xbaadf00d), in the hope people don't start to try to
      understand the various random IIR values in trap 7 dumps.
      
      Noticed-by: John David Anglin <dave.anglin@bell.net>
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tomoyo: use hwight16() in tomoyo_domain_quota_is_ok() · 0643d917
      Tetsuo Handa authored
      [ Upstream commit f702e110 ]
      
      hweight16() is much faster. While we are at it, there is no need to
      include the "perm =" part in the data_race() macro, for perm is a local
      variable that cannot be accessed by other threads.
      
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
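A Hamming-weight helper for 16-bit values counts the set bits in one call instead of testing each bit position in turn. The portable sketch below uses the clear-lowest-set-bit loop. `popcount16()` is a hypothetical name and says nothing about how the kernel implements its own helper.

```c
#include <assert.h>
#include <stdint.h>

/* Count the set bits of a 16-bit word by clearing the lowest set bit each pass. */
static unsigned int popcount16(uint16_t w)
{
    unsigned int count = 0;

    while (w) {
        w &= (uint16_t)(w - 1); /* drops the lowest set bit */
        count++;
    }
    return count;
}
```

The loop runs once per set bit rather than once per bit position, so sparse permission masks finish in a few iterations.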
    • tomoyo: Check exceeded quota early in tomoyo_domain_quota_is_ok(). · e2048a1f
      Dmitry Vyukov authored
      [ Upstream commit 04e57a2d ]
      
      If tomoyo is used in a testing/fuzzing environment in learning mode,
      for lots of domains the quota will be exceeded and stay exceeded
      for prolonged periods of time. In such cases it's pointless (and slow)
      to walk the whole acl list again and again just to rediscover that
      the quota is exceeded. We already have the TOMOYO_DIF_QUOTA_WARNED flag
      that notes the overflow condition. Check it early to avoid the slowdown.
      
      [penguin-kernel]
      This patch causes a user visible change that the learning mode will not be
      automatically resumed after the quota is increased. To resume the learning
      mode, administrator will need to explicitly clear TOMOYO_DIF_QUOTA_WARNED
      flag after increasing the quota. But I think that this change is generally
      preferable, for administrator likely wants to optimize the acl list for
      that domain before increasing the quota, or that domain likely hits the
      quota again. Therefore, don't try to care to clear TOMOYO_DIF_QUOTA_WARNED
      flag automatically when the quota for that domain changed.
      
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Input: i8042 - enable deferred probe quirk for ASUS UM325UA · 210c7c69
      Samuel Čavoj authored
      [ Upstream commit 44ee250a ]
      
      The ASUS UM325UA suffers from the same issue as the ASUS UX425UA, which
      is a very similar laptop. The i8042 device is not usable immediately
      after boot and fails to initialize, requiring a deferred retry.
      
      Enable the deferred probe quirk for the UM325UA.
      
      BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1190256
      Signed-off-by: Samuel Čavoj <samuel@cavoj.net>
      Link: https://lore.kernel.org/r/20211204015615.232948-1-samuel@cavoj.net
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Input: i8042 - add deferred probe support · bb672eff
      Takashi Iwai authored
      [ Upstream commit 9222ba68 ]
      
      We've got a bug report about the non-working keyboard on ASUS ZenBook
      UX425UA.  It seems that the PS/2 device isn't ready immediately at
      boot but takes some seconds to get ready.  Until now, the only
      workaround is to defer the probe, but it's available only when the
      driver is a module.  However, many distros, including openSUSE as in
      the original report, build the PS/2 input drivers into the kernel, hence
      the workaround is not easily applicable there.
      
      This patch adds the support for the deferred probe for i8042 stuff as
      a workaround of the problem above.  When the deferred probe mode is
      enabled and the device couldn't be probed, it'll be repeated with the
      standard deferred probe mechanism.
      
      The deferred probe mode is enabled either via the new option
      i8042.probe_defer or via the quirk table entry.  As of this patch, the
      quirk table contains only ASUS ZenBook UX425UA.
      
      The deferred probe part is based on Fabio's initial work.
      
      BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1190256
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Tested-by: Samuel Čavoj <samuel@cavoj.net>
      Link: https://lore.kernel.org/r/20211117063757.11380-1-tiwai@suse.de
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. Dec 29, 2021