  1. Jan 18, 2023
    • net/mlx5: Rename ptp clock info · 62d707da
      Eran Ben Elisha authored
      [ Upstream commit aac2df7f ]
      
      Fix a typo in ptp_clock_info naming: mlx5_p2p -> mlx5_ptp.
      
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Stable-dep-of: fe91d572 ("net/mlx5: Fix ptp max frequency adjustment range")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/sched: act_mpls: Fix warning during failed attribute validation · 2b157c3c
      Ido Schimmel authored
      [ Upstream commit 9e17f992 ]
      
      The 'TCA_MPLS_LABEL' attribute is of 'NLA_U32' type, but has a
      validation type of 'NLA_VALIDATE_FUNCTION'. This is an invalid
      combination according to the comment above 'struct nla_policy':
      
      "
      Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN:
         NLA_BINARY           Validation function called for the attribute.
         All other            Unused - but note that it's a union
      "
      
      This can trigger the warning [1] in nla_get_range_unsigned() when
      validation of the attribute fails. Despite being of 'NLA_U32' type, the
      associated 'min'/'max' fields in the policy are negative as they are
      aliased by the 'validate' field.
      
      Fix by changing the attribute type to 'NLA_BINARY' which is consistent
      with the above comment and all other users of NLA_POLICY_VALIDATE_FN().
      As a result, move the length validation to the validation function.
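
The type/validation mismatch described above can be modelled in userspace. The following is only a minimal sketch: `toy_policy`, `toy_policy_is_consistent()` and `toy_validate_label()` are hypothetical names, not the kernel's `struct nla_policy` or the actual patch, and the label check covers only the 20-bit range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for netlink attribute types. */
enum toy_type { TOY_U32, TOY_BINARY };
enum toy_validation { TOY_VALIDATE_NONE, TOY_VALIDATE_RANGE, TOY_VALIDATE_FUNCTION };

struct toy_policy {
    enum toy_type type;
    enum toy_validation validation;
    union {
        /* Meaningful for integer types only. */
        struct { int16_t min, max; } range;
        /* Meaningful for TOY_BINARY only; aliases the range fields. */
        int (*validate)(const void *data, size_t len);
    } u;
};

/* A validation function is only a sane combination with the binary type. */
static bool toy_policy_is_consistent(const struct toy_policy *p)
{
    if (p->validation == TOY_VALIDATE_FUNCTION)
        return p->type == TOY_BINARY;
    return true;
}

/* With a binary attribute, the length check has to live in the validator. */
static int toy_validate_label(const void *data, size_t len)
{
    uint32_t label;

    if (len != sizeof(label))
        return -1;                      /* wrong payload length */
    memcpy(&label, data, sizeof(label));
    return (label & ~0xFFFFFu) ? -1 : 0; /* MPLS labels are 20 bits wide */
}
```

In this model, a policy that pairs `TOY_U32` with `TOY_VALIDATE_FUNCTION` is flagged as inconsistent, because reading `u.range` would reinterpret the bytes of the function pointer.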
      
      No regressions in MPLS tests:
      
       # ./tdc.py -f tc-tests/actions/mpls.json
       [...]
       # echo $?
       0
      
      [1]
      WARNING: CPU: 0 PID: 17743 at lib/nlattr.c:118
      nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117
      Modules linked in:
      CPU: 0 PID: 17743 Comm: syz-executor.0 Not tainted 6.1.0-rc8 #3
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
      RIP: 0010:nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117
      [...]
      Call Trace:
       <TASK>
       __netlink_policy_dump_write_attr+0x23d/0x990 net/netlink/policy.c:310
       netlink_policy_dump_write_attr+0x22/0x30 net/netlink/policy.c:411
       netlink_ack_tlv_fill net/netlink/af_netlink.c:2454 [inline]
       netlink_ack+0x546/0x760 net/netlink/af_netlink.c:2506
       netlink_rcv_skb+0x1b7/0x240 net/netlink/af_netlink.c:2546
       rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6109
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:714 [inline]
       sock_sendmsg net/socket.c:734 [inline]
       ____sys_sendmsg+0x38f/0x500 net/socket.c:2482
       ___sys_sendmsg net/socket.c:2536 [inline]
       __sys_sendmsg+0x197/0x230 net/socket.c:2565
       __do_sys_sendmsg net/socket.c:2574 [inline]
       __se_sys_sendmsg net/socket.c:2572 [inline]
       __x64_sys_sendmsg+0x42/0x50 net/socket.c:2572
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Link: https://lore.kernel.org/netdev/CAO4mrfdmjvRUNbDyP0R03_DrD_eFCLCguz6OxZ2TYRSv0K9gxA@mail.gmail.com/
      Fixes: 2a2ea508 ("net: sched: add mpls manipulation actions to TC")
      Reported-by: Wei Chen <harperchen1110@gmail.com>
      Tested-by: Wei Chen <harperchen1110@gmail.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Link: https://lore.kernel.org/r/20230107171004.608436-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nfc: pn533: Wait for out_urb's completion in pn533_usb_send_frame() · 9424d220
      Minsuk Kang authored
      [ Upstream commit 9dab880d ]
      
      Fix a use-after-free that occurs in hcd when in_urb sent from
      pn533_usb_send_frame() is completed earlier than out_urb. Its callback
      frees the skb data in pn533_send_async_complete() that is used as a
      transfer buffer of out_urb. Wait before sending in_urb until the
      callback of out_urb is called. To modify the callback of out_urb alone,
      separate the complete function of out_urb and ack_urb.
      
      Found by a modified version of syzkaller.
      
      BUG: KASAN: use-after-free in dummy_timer
      Call Trace:
       memcpy (mm/kasan/shadow.c:65)
       dummy_perform_transfer (drivers/usb/gadget/udc/dummy_hcd.c:1352)
       transfer (drivers/usb/gadget/udc/dummy_hcd.c:1453)
       dummy_timer (drivers/usb/gadget/udc/dummy_hcd.c:1972)
       arch_static_branch (arch/x86/include/asm/jump_label.h:27)
       static_key_false (include/linux/jump_label.h:207)
       timer_expire_exit (include/trace/events/timer.h:127)
       call_timer_fn (kernel/time/timer.c:1475)
       expire_timers (kernel/time/timer.c:1519)
       __run_timers (kernel/time/timer.c:1790)
       run_timer_softirq (kernel/time/timer.c:1803)
      
      Fixes: c46ee386 ("NFC: pn533: add NXP pn533 nfc device driver")
      Signed-off-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • hvc/xen: lock console list traversal · 576eadef
      Roger Pau Monne authored
      [ Upstream commit c0dccad8 ]
      
      The currently lockless access to the xen console list in
      vtermno_to_xencons() is incorrect, as additions and removals from the
      list can happen anytime, and as such the traversal of the list to get
      the private console data for a given termno needs to happen with the
      lock held.  Note users that modify the list already do so with the
      lock taken.
      
      Adjust current lock takers to use the _irq{save,restore} helpers,
      since the context in which vtermno_to_xencons() is called can have
      interrupts disabled, so the locked regions must run with interrupts
      disabled as well.  I haven't checked if existing users could instead
      use the _irq variant, as I think it's safer to use
      _irq{save,restore} upfront.

      While there, switch from using list_for_each_entry_safe to
      list_for_each_entry: the current entry cursor won't be removed as
      part of the code in the loop body, so using the _safe variant is
      pointless.
      
      Fixes: 02e19f9c ('hvc_xen: implement multiconsole support')
      Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Link: https://lore.kernel.org/r/20221130163611.14686-1-roger.pau@citrix.com
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tipc: fix unexpected link reset due to discovery messages · 7d04fe15
      Tung Nguyen authored
      [ Upstream commit c244c092 ]
      
      This unexpected behavior is observed:
      
      node 1                    | node 2
      ------                    | ------
      link is established       | link is established
      reboot                    | link is reset
      up                        | send discovery message
      receive discovery message |
      link is established       | link is established
      send discovery message    |
                                | receive discovery message
                                | link is reset (unexpected)
                                | send reset message
      link is reset             |
      
      It is due to delayed re-discovery as described in function
      tipc_node_check_dest(): "this link endpoint has already reset
      and re-established contact with the peer, before receiving a
      discovery message from that node."
      
      However, commit 598411d7 changed the condition for calling
      tipc_node_link_down(), which used to be the acceptance of a new media
      address.
      
      This commit fixes this by restoring the old and correct behavior.
      
      Fixes: 598411d7 ("tipc: make resetting of links non-atomic")
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tipc: eliminate checking netns if node established · 95b2d488
      Hoang Le authored
      [ Upstream commit d408bef4 ]
      
      Currently, we scan all network namespaces on each received
      discovery message in order to check whether the sending peer might be
      present in a host-local namespace.
      
      This is unnecessary since we can assume that a peer will not change its
      location during an established session.
      
      We now improve the condition for this testing so that we don't perform
      any redundant scans.
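
The idea of skipping the scan once a session is established can be sketched as follows. This is a userspace illustration only: `toy_netns`, `toy_peer`, `toy_on_discovery()` and the `scans_performed` counter are hypothetical and do not mirror the real TIPC data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real TIPC structures are far richer. */
struct toy_netns {
    uint32_t hash_mix;
};

struct toy_peer {
    bool established;              /* a link to this peer is already up */
    const struct toy_netns *local; /* non-NULL once found in a local netns */
};

static unsigned int scans_performed; /* counts the expensive scans */

/* The costly part: walk every namespace looking for the peer. */
static const struct toy_netns *toy_scan_namespaces(const struct toy_netns *all,
                                                   size_t n, uint32_t proof)
{
    scans_performed++;
    for (size_t i = 0; i < n; i++)
        if (all[i].hash_mix == proof)
            return &all[i];
    return NULL;
}

/* Called for every received discovery message. */
static void toy_on_discovery(struct toy_peer *peer, const struct toy_netns *all,
                             size_t n, uint32_t proof)
{
    if (peer->established)
        return; /* the peer cannot change location during a session */
    peer->local = toy_scan_namespaces(all, n, proof);
}
```

Repeated discovery messages from an established peer leave `scans_performed` unchanged in this sketch.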
      
      Fixes: f73b1281 ("tipc: improve throughput between nodes in netns")
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Stable-dep-of: c244c092 ("tipc: fix unexpected link reset due to discovery messages")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tipc: improve throughput between nodes in netns · d6418829
      Hoang Le authored
      [ Upstream commit f73b1281 ]
      
      Currently, TIPC transports intra-node user data messages directly
      socket to socket, hence shortcutting all the lower layers of the
      communication stack. This gives TIPC very good intra node performance,
      both regarding throughput and latency.
      
      We now introduce a similar mechanism for TIPC data traffic across
      network namespaces located in the same kernel. On the send path, the
      call chain is as always accompanied by the sending node's network name
      space pointer. However, once we have reliably established that the
      receiving node is represented by a namespace on the same host, we just
      replace the namespace pointer with the receiving node/namespace's
      ditto, and follow the regular socket receive path through the receiving
      node. This technique gives us a throughput similar to the node internal
      throughput, several times larger than if we let the traffic go through
      the full network stacks. As a comparison, max throughput for 64k
      messages is four times larger than TCP throughput for the same type of
      traffic.
      
      To meet any security concerns, the following should be noted.
      
      - All nodes joining a cluster are supposed to have been certified
      and authenticated by mechanisms outside TIPC. This is no different for
      nodes/namespaces on the same host; they have to auto discover each
      other using the attached interfaces, and establish links which are
      supervised via the regular link monitoring mechanism. Hence, a kernel
      local node has no other way to join a cluster than any other node, and
      has to obey the policies set in the IP or device layers of the stack.
      
      - Only when a sender has established with 100% certainty that the peer
      node is located in a kernel local namespace does it choose to let user
      data messages, and only those, take the crossover path to the receiving
      node/namespace.
      
      - If the receiving node/namespace is removed, its namespace pointer
      is invalidated at all peer nodes, and their neighbor link monitoring
      will eventually note that this node is gone.
      
      - To ensure the "100% certainty" criterion, and prevent any possible
      spoofing, received discovery messages must contain a proof that the
      sender knows a common secret. We use the hash mix of the sending
      node/namespace for this purpose, since it can be accessed directly by
      all other namespaces in the kernel. Upon reception of a discovery
      message, the receiver checks this proof against the hash_mix values
      of all the local namespaces. If it finds a match, this, along with a
      matching node id and cluster id, is deemed sufficient proof that
      the peer node in question is in a local namespace, and a wormhole can
      be opened.
      
      - We should also consider that TIPC is intended to be a cluster local
      IPC mechanism (just like e.g. UNIX sockets) rather than a network
      protocol, and hence we think it can be justified to allow it to shortcut the
      lower protocol layers.
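
The proof check from the list above (hash mix, node id and cluster id must all match) might be sketched like this. The structures and `toy_find_local_peer()` are assumptions made for illustration; the real discovery message layout and the way the proof is derived are not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NODE_ID_LEN 16 /* assumed length, for illustration only */

/* Hypothetical view of what is known about each local namespace. */
struct toy_local_ns {
    uint32_t hash_mix;
    uint32_t cluster_id;
    uint8_t node_id[TOY_NODE_ID_LEN];
};

/* Hypothetical subset of a received discovery message. */
struct toy_discovery_msg {
    uint32_t proof; /* hash mix claimed by the sender */
    uint32_t cluster_id;
    uint8_t node_id[TOY_NODE_ID_LEN];
};

/* Return the local namespace the sender lives in, or NULL if none matches. */
static const struct toy_local_ns *
toy_find_local_peer(const struct toy_local_ns *ns, size_t n,
                    const struct toy_discovery_msg *msg)
{
    for (size_t i = 0; i < n; i++) {
        if (ns[i].hash_mix != msg->proof)
            continue;
        if (ns[i].cluster_id != msg->cluster_id)
            continue;
        if (memcmp(ns[i].node_id, msg->node_id, TOY_NODE_ID_LEN) != 0)
            continue;
        return &ns[i]; /* all three criteria matched */
    }
    return NULL;
}
```

Only a non-NULL result would justify taking the crossover path; a matching proof with a different cluster id is rejected in this sketch.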
      
      Regarding traceability, we should note that since commit 6c9081a3
      ("tipc: add loopback device tracking") it is possible to follow the node
      internal packet flow by just activating tcpdump on the loopback
      interface. This will be true even for this mechanism; by activating
      tcpdump on the involved nodes' loopback interfaces their inter-name
      space messaging can easily be tracked.
      
      v2:
      - update 'net' pointer when node left/rejoined
      v3:
      - grab read/write lock when using node ref obj
      v4:
      - clone traffics between netns to loopback
      
      Suggested-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Stable-dep-of: c244c092 ("tipc: fix unexpected link reset due to discovery messages")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • regulator: da9211: Use irq handler when ready · d443308e
      Ricardo Ribalda authored
      [ Upstream commit 02228f6a ]
      
      If the system does not come from reset (e.g. when it was started via
      kexec()), the regulator might have an IRQ waiting for us.
      
      If we enable the IRQ handler before its structures are ready, we crash.
      
      This patch fixes:
      
      [    1.141839] Unable to handle kernel read from unreadable memory at virtual address 0000000000000078
      [    1.316096] Call trace:
      [    1.316101]  blocking_notifier_call_chain+0x20/0xa8
      [    1.322757] cpu cpu0: dummy supplies not allowed for exclusive requests
      [    1.327823]  regulator_notifier_call_chain+0x1c/0x2c
      [    1.327825]  da9211_irq_handler+0x68/0xf8
      [    1.327829]  irq_thread+0x11c/0x234
      [    1.327833]  kthread+0x13c/0x154
      
      Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
      Reviewed-by: Adam Ward <DLG-Adam.Ward.opensource@dm.renesas.com>
      Link: https://lore.kernel.org/r/20221124-da9211-v2-0-1779e3c5d491@chromium.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • EDAC/device: Fix period calculation in edac_device_reset_delay_period() · 43f48e6c
      Eliav Farber authored
      commit e8407743 upstream.
      
      Fix the period calculation for the case where the user sets a value of
      1000.  The input of round_jiffies_relative() should be in jiffies and
      not in milliseconds.
      
        [ bp: Use the same code pattern as in edac_device_workq_setup() for
          clarity. ]
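
The unit mix-up can be illustrated with a small userspace model. Everything here is an assumption for illustration: `TOY_HZ` is an arbitrary tick rate, and the two helpers are simplified stand-ins for `msecs_to_jiffies()` and `round_jiffies_relative()`, not the kernel implementations.

```c
#include <assert.h>

#define TOY_HZ 250UL /* assumed tick rate; the kernel's HZ is a build option */

/* Simplified stand-in for msecs_to_jiffies(): round up to whole ticks. */
static unsigned long toy_msecs_to_jiffies(unsigned long msec)
{
    return (msec * TOY_HZ + 999UL) / 1000UL;
}

/*
 * Simplified stand-in for round_jiffies_relative(): round a delay given in
 * jiffies up to a whole second. The real helper does more; this only shows
 * which unit the argument must have.
 */
static unsigned long toy_round_to_second(unsigned long delay_jiffies)
{
    return ((delay_jiffies + TOY_HZ - 1) / TOY_HZ) * TOY_HZ;
}

/* Delay, in jiffies, to hand to the work queue for a period given in ms. */
static unsigned long toy_poll_delay(unsigned long poll_msec)
{
    unsigned long delay = toy_msecs_to_jiffies(poll_msec);

    if (poll_msec == 1000)
        return toy_round_to_second(delay); /* argument is in jiffies */
    return delay;
}
```

Under these assumptions, passing the raw value 1000 (milliseconds) to the rounding helper yields 1000 jiffies, which is four seconds at 250 ticks per second, while converting first yields the intended 250 jiffies.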
      
      Fixes: c4cf3b45 ("EDAC: Rework workqueue handling")
      Signed-off-by: Eliav Farber <farbere@amazon.com>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: <stable@kernel.org>
      Link: https://lore.kernel.org/r/20221020124458.22153-1-farbere@amazon.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/boot: Avoid using Intel mnemonics in AT&T syntax asm · a5b73762
      Peter Zijlstra authored
      commit 7c6dd961 upstream.
      
      With 'GNU assembler (GNU Binutils for Debian) 2.39.90.20221231' the
      build now reports:
      
        arch/x86/realmode/rm/../../boot/bioscall.S: Assembler messages:
        arch/x86/realmode/rm/../../boot/bioscall.S:35: Warning: found `movsd'; assuming `movsl' was meant
        arch/x86/realmode/rm/../../boot/bioscall.S:70: Warning: found `movsd'; assuming `movsl' was meant
      
        arch/x86/boot/bioscall.S: Assembler messages:
        arch/x86/boot/bioscall.S:35: Warning: found `movsd'; assuming `movsl' was meant
        arch/x86/boot/bioscall.S:70: Warning: found `movsd'; assuming `movsl' was meant
      
      Which is due to:
      
        PR gas/29525
      
        Note that with the dropped CMPSD and MOVSD Intel Syntax string insn
        templates taking operands, mixed IsString/non-IsString template groups
        (with memory operands) cannot occur anymore. With that
        maybe_adjust_templates() becomes unnecessary (and is hence being
        removed).
      
      More details: https://sourceware.org/bugzilla/show_bug.cgi?id=29525
      
      Borislav Petkov further explains:

        " the particular problem here is that the 'd' suffix is
          "conflicting" in the sense that you can have SSE mnemonics like movsD %xmm...
          and the same thing also for string ops (which is the case here) so apparently
          the agreement in binutils land is to use the always accepted suffixes 'l' or 'q'
          and phase out 'd' slowly... "
      
      Fixes: 7a734e7d ("x86, setup: "glove box" BIOS calls -- infrastructure")
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
      Link: https://lore.kernel.org/r/Y71I3Ex2pvIxMpsP@hirez.programming.kicks-ass.net
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/imc-pmu: Fix use of mutex in IRQs disabled section · d0c6d2a3
      Kajol Jain authored
      commit 76d588dd upstream.
      
      Current imc-pmu code triggers a WARNING with CONFIG_DEBUG_ATOMIC_SLEEP
      and CONFIG_PROVE_LOCKING enabled, while running a thread_imc event.
      
      Command to trigger the warning:
        # perf stat -e thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ sleep 5
      
         Performance counter stats for 'sleep 5':
      
                         0      thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/
      
               5.002117947 seconds time elapsed
      
               0.000131000 seconds user
               0.001063000 seconds sys
      
      Below is a snippet of the warning in dmesg:
      
        BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
        in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2869, name: perf-exec
        preempt_count: 2, expected: 0
        4 locks held by perf-exec/2869:
         #0: c00000004325c540 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x64/0xa90
         #1: c00000004325c5d8 (&sig->exec_update_lock){++++}-{3:3}, at: begin_new_exec+0x460/0xef0
         #2: c0000003fa99d4e0 (&cpuctx_lock){-...}-{2:2}, at: perf_event_exec+0x290/0x510
         #3: c000000017ab8418 (&ctx->lock){....}-{2:2}, at: perf_event_exec+0x29c/0x510
        irq event stamp: 4806
        hardirqs last  enabled at (4805): [<c000000000f65b94>] _raw_spin_unlock_irqrestore+0x94/0xd0
        hardirqs last disabled at (4806): [<c0000000003fae44>] perf_event_exec+0x394/0x510
        softirqs last  enabled at (0): [<c00000000013c404>] copy_process+0xc34/0x1ff0
        softirqs last disabled at (0): [<0000000000000000>] 0x0
        CPU: 36 PID: 2869 Comm: perf-exec Not tainted 6.2.0-rc2-00011-g1247637727f2 #61
        Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
        Call Trace:
          dump_stack_lvl+0x98/0xe0 (unreliable)
          __might_resched+0x2f8/0x310
          __mutex_lock+0x6c/0x13f0
          thread_imc_event_add+0xf4/0x1b0
          event_sched_in+0xe0/0x210
          merge_sched_in+0x1f0/0x600
          visit_groups_merge.isra.92.constprop.166+0x2bc/0x6c0
          ctx_flexible_sched_in+0xcc/0x140
          ctx_sched_in+0x20c/0x2a0
          ctx_resched+0x104/0x1c0
          perf_event_exec+0x340/0x510
          begin_new_exec+0x730/0xef0
          load_elf_binary+0x3f8/0x1e10
        ...
        do not call blocking ops when !TASK_RUNNING; state=2001 set at [<00000000fd63e7cf>] do_nanosleep+0x60/0x1a0
        WARNING: CPU: 36 PID: 2869 at kernel/sched/core.c:9912 __might_sleep+0x9c/0xb0
        CPU: 36 PID: 2869 Comm: sleep Tainted: G        W          6.2.0-rc2-00011-g1247637727f2 #61
        Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
        NIP:  c000000000194a1c LR: c000000000194a18 CTR: c000000000a78670
        REGS: c00000004d2134e0 TRAP: 0700   Tainted: G        W           (6.2.0-rc2-00011-g1247637727f2)
        MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 48002824  XER: 00000000
        CFAR: c00000000013fb64 IRQMASK: 1
      
      The above warning is triggered because the current imc-pmu code takes a
      mutex in interrupt-disabled sections. The function mutex_lock()
      internally calls __might_resched(), which checks whether IRQs are
      disabled and, if they are, triggers the warning.

      Fix the issue by changing the mutex to a spinlock.
      
      Fixes: 8f95faaa ("powerpc/powernv: Detect and create IMC device")
      Reported-by: Michael Petlan <mpetlan@redhat.com>
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      [mpe: Fix comments, trim oops in change log, add reported-by tags]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230106065157.182648-1-kjain@linux.ibm.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: ipset: Fix overflow before widen in the bitmap_ip_create() function. · feefb33e
      Gavrilov Ilia authored
      commit 9ea4b476 upstream.
      
      When first_ip is 0, last_ip is 0xFFFFFFFF, and netmask is 31, the value of
      the arithmetic expression 2 << (netmask - mask_bits - 1) is subject
      to overflow because the operands are not cast to a larger data type
      before the arithmetic is performed.
      
      Note that it's harmless since the value will be checked at the next step.
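
The "overflow before widen" pattern can be shown in a short userspace sketch. It assumes that mask_bits is 0 for the full 0 to 0xFFFFFFFF range; `toy_elements()` and `toy_overflows_int()` are hypothetical helpers, not the code of the actual patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Element count for a given netmask and range prefix length, with the
 * constant widened to 64 bits before shifting. The caller must ensure
 * mask_bits < netmask <= 32.
 */
static uint64_t toy_elements(unsigned int netmask, unsigned int mask_bits)
{
    return (uint64_t)2 << (netmask - mask_bits - 1);
}

/* Non-zero when the same value would not fit in a plain signed int. */
static int toy_overflows_int(unsigned int netmask, unsigned int mask_bits)
{
    return toy_elements(netmask, mask_bits) > (uint64_t)INT_MAX;
}
```

For netmask 31 and mask_bits 0 the widened result is 0x80000000, one more than a 32-bit `int` can hold, which is why the shift has to be done on a wider operand.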
      
      Found by InfoTeCS on behalf of Linux Verification Center
      (linuxtesting.org) with SVACE.
      
      Fixes: b9fed748 ("netfilter: ipset: Check and reject crazy /0 input parameters")
      Signed-off-by: Ilia.Gavrilov <Ilia.Gavrilov@infotecs.ru>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix uninititialized value in 'ext4_evict_inode' · e431b4fb
      Ye Bin authored
      [ Upstream commit 7ea71af9 ]
      
      Syzbot found the following issue:
      =====================================================
      BUG: KMSAN: uninit-value in ext4_evict_inode+0xdd/0x26b0 fs/ext4/inode.c:180
       ext4_evict_inode+0xdd/0x26b0 fs/ext4/inode.c:180
       evict+0x365/0x9a0 fs/inode.c:664
       iput_final fs/inode.c:1747 [inline]
       iput+0x985/0xdd0 fs/inode.c:1773
       __ext4_new_inode+0xe54/0x7ec0 fs/ext4/ialloc.c:1361
       ext4_mknod+0x376/0x840 fs/ext4/namei.c:2844
       vfs_mknod+0x79d/0x830 fs/namei.c:3914
       do_mknodat+0x47d/0xaa0
       __do_sys_mknodat fs/namei.c:3992 [inline]
       __se_sys_mknodat fs/namei.c:3989 [inline]
       __ia32_sys_mknodat+0xeb/0x150 fs/namei.c:3989
       do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
       __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
       do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203
       do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:246
       entry_SYSENTER_compat_after_hwframe+0x70/0x82
      
      Uninit was created at:
       __alloc_pages+0x9f1/0xe80 mm/page_alloc.c:5578
       alloc_pages+0xaae/0xd80 mm/mempolicy.c:2285
       alloc_slab_page mm/slub.c:1794 [inline]
       allocate_slab+0x1b5/0x1010 mm/slub.c:1939
       new_slab mm/slub.c:1992 [inline]
       ___slab_alloc+0x10c3/0x2d60 mm/slub.c:3180
       __slab_alloc mm/slub.c:3279 [inline]
       slab_alloc_node mm/slub.c:3364 [inline]
       slab_alloc mm/slub.c:3406 [inline]
       __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
       kmem_cache_alloc_lru+0x6f3/0xb30 mm/slub.c:3429
       alloc_inode_sb include/linux/fs.h:3117 [inline]
       ext4_alloc_inode+0x5f/0x860 fs/ext4/super.c:1321
       alloc_inode+0x83/0x440 fs/inode.c:259
       new_inode_pseudo fs/inode.c:1018 [inline]
       new_inode+0x3b/0x430 fs/inode.c:1046
       __ext4_new_inode+0x2a7/0x7ec0 fs/ext4/ialloc.c:959
       ext4_mkdir+0x4d5/0x1560 fs/ext4/namei.c:2992
       vfs_mkdir+0x62a/0x870 fs/namei.c:4035
       do_mkdirat+0x466/0x7b0 fs/namei.c:4060
       __do_sys_mkdirat fs/namei.c:4075 [inline]
       __se_sys_mkdirat fs/namei.c:4073 [inline]
       __ia32_sys_mkdirat+0xc4/0x120 fs/namei.c:4073
       do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
       __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
       do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203
       do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:246
       entry_SYSENTER_compat_after_hwframe+0x70/0x82
      
      CPU: 1 PID: 4625 Comm: syz-executor.2 Not tainted 6.1.0-rc4-syzkaller-62821-gcb231e2f67ec #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
      =====================================================
      
      Currently, 'ext4_alloc_inode()' does not initialize 'ei->i_flags'. If
      creating a new inode fails before 'ei->i_flags' is set in
      '__ext4_new_inode()', 'iput()' is called on it. Since commit 6bc0d63d,
      'ext4_evict_inode()' accesses 'ei->i_flags', which leads to a read of an
      uninitialized value.
      To solve the above issue, just initialize 'ei->i_flags' in 'ext4_alloc_inode()'.
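
A minimal userspace sketch of initializing the field at allocation time follows. `toy_inode_info`, `TOY_EA_INODE_FL` and both functions are hypothetical; the `memset()` with 0xAA only emulates recycled memory that still holds stale bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_EA_INODE_FL 0x1UL /* hypothetical flag bit */

struct toy_inode_info {
    unsigned long i_flags;
    int other_state;
};

/* Allocate an inode and initialize i_flags right away. */
static struct toy_inode_info *toy_alloc_inode(void)
{
    struct toy_inode_info *ei = malloc(sizeof(*ei));

    if (!ei)
        return NULL;
    memset(ei, 0xAA, sizeof(*ei)); /* emulate stale bytes in recycled memory */
    ei->i_flags = 0;               /* the point of the fix: no stale flags */
    return ei;
}

/* Eviction inspects i_flags, so it must never see stale bytes. */
static int toy_evict_sees_ea_inode(const struct toy_inode_info *ei)
{
    return (ei->i_flags & TOY_EA_INODE_FL) != 0;
}
```

With the initialization in place, eviction of an inode whose setup failed early reads a well-defined value.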
      
      Reported-by: <syzbot+57b25da729eb0b88177d@syzkaller.appspotmail.com>
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Fixes: 6bc0d63d ("ext4: remove EA inode entry from mbcache on inode eviction")
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Link: https://lore.kernel.org/r/20221117073603.2598882-1-yebin@huaweicloud.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ext4: fix use-after-free in ext4_orphan_cleanup · 026a4490
      Baokun Li authored
      [ Upstream commit a71248b1 ]
      
      I caught an issue as follows:
      ==================================================================
       BUG: KASAN: use-after-free in __list_add_valid+0x28/0x1a0
       Read of size 8 at addr ffff88814b13f378 by task mount/710
      
       CPU: 1 PID: 710 Comm: mount Not tainted 6.1.0-rc3-next #370
       Call Trace:
        <TASK>
        dump_stack_lvl+0x73/0x9f
        print_report+0x25d/0x759
        kasan_report+0xc0/0x120
        __asan_load8+0x99/0x140
        __list_add_valid+0x28/0x1a0
        ext4_orphan_cleanup+0x564/0x9d0 [ext4]
        __ext4_fill_super+0x48e2/0x5300 [ext4]
        ext4_fill_super+0x19f/0x3a0 [ext4]
        get_tree_bdev+0x27b/0x450
        ext4_get_tree+0x19/0x30 [ext4]
        vfs_get_tree+0x49/0x150
        path_mount+0xaae/0x1350
        do_mount+0xe2/0x110
        __x64_sys_mount+0xf0/0x190
        do_syscall_64+0x35/0x80
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
        </TASK>
       [...]
      ==================================================================
      
      Above issue may happen as follows:
      -------------------------------------
      ext4_fill_super
        ext4_orphan_cleanup
         --- loop1: assume last_orphan is 12 ---
          list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan)
          ext4_truncate --> return 0
            ext4_inode_attach_jinode --> return -ENOMEM
          iput(inode) --> free inode<12>
         --- loop2: last_orphan is still 12 ---
          list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan);
          // use inode<12> and trigger UAF
      
      To solve this issue, we need to propagate the return value of
      ext4_inode_attach_jinode() appropriately.
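
Propagating the helper's return value can be sketched in userspace as below. The functions and the `toy_attach_should_fail` knob are hypothetical stand-ins, not the ext4 code; they only show the caller seeing the error instead of 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static bool toy_attach_should_fail; /* test knob, purely illustrative */

/* Stand-in for a helper that may fail with -ENOMEM. */
static int toy_attach_jinode(void)
{
    return toy_attach_should_fail ? -ENOMEM : 0;
}

/* Returns 0 on success or the helper's negative errno on failure. */
static int toy_truncate(void)
{
    int err = toy_attach_jinode();

    if (err)
        return err; /* propagate instead of silently returning 0 */
    /* ... the real truncate work would happen here ... */
    return 0;
}
```

A caller that checks the result of `toy_truncate()` can then handle the failed inode explicitly rather than treating it as processed.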
      
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20221102080633.1630225-1-libaokun1@huawei.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ext4: lost matching-pair of trace in ext4_truncate · fa41a133
      zhengliang authored
      [ Upstream commit 9a5d265f ]
      
      ext4_truncate() should call the trace exit on all return paths.
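
One common way to keep enter and exit trace calls paired is a single exit label. The sketch below assumes hypothetical counters in place of real tracepoints, and `toy_traced_truncate()` with its `mode` parameter is invented for illustration.

```c
#include <assert.h>
#include <errno.h>

static int toy_trace_enter_count; /* stands in for the enter tracepoint */
static int toy_trace_exit_count;  /* stands in for the exit tracepoint */

static void toy_trace_enter(void) { toy_trace_enter_count++; }
static void toy_trace_exit(void)  { toy_trace_exit_count++; }

/* mode 0: nothing to do, mode 1: early failure, anything else: full run. */
static int toy_traced_truncate(int mode)
{
    int err = 0;

    toy_trace_enter();
    if (mode == 0)
        goto out_trace; /* early return still hits the exit trace */
    if (mode == 1) {
        err = -ENOMEM;
        goto out_trace; /* so does the error path */
    }
    /* ... main work would happen here ... */
out_trace:
    toy_trace_exit();
    return err;
}
```

Whatever path is taken, the two counters stay equal, which is the property the commit restores for the real tracepoints.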
      
      Signed-off-by: zhengliang <zhengliang6@huawei.com>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Link: https://lore.kernel.org/r/20200701083027.45996-1-zhengliang6@huawei.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Stable-dep-of: a71248b1 ("ext4: fix use-after-free in ext4_orphan_cleanup")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ext4: fix bug_on in __es_tree_search caused by bad quota inode · 1d552483
      Baokun Li authored
      [ Upstream commit d3238774 ]
      
      We got an issue as follows:
      ==================================================================
       kernel BUG at fs/ext4/extents_status.c:202!
       invalid opcode: 0000 [#1] PREEMPT SMP
       CPU: 1 PID: 810 Comm: mount Not tainted 6.1.0-rc1-next-g9631525255e3 #352
       RIP: 0010:__es_tree_search.isra.0+0xb8/0xe0
       RSP: 0018:ffffc90001227900 EFLAGS: 00010202
       RAX: 0000000000000000 RBX: 0000000077512a0f RCX: 0000000000000000
       RDX: 0000000000000002 RSI: 0000000000002a10 RDI: ffff8881004cd0c8
       RBP: ffff888177512ac8 R08: 47ffffffffffffff R09: 0000000000000001
       R10: 0000000000000001 R11: 00000000000679af R12: 0000000000002a10
       R13: ffff888177512d88 R14: 0000000077512a10 R15: 0000000000000000
       FS:  00007f4bd76dbc40(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00005653bf993cf8 CR3: 000000017bfdf000 CR4: 00000000000006e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        <TASK>
        ext4_es_cache_extent+0xe2/0x210
        ext4_cache_extents+0xd2/0x110
        ext4_find_extent+0x5d5/0x8c0
        ext4_ext_map_blocks+0x9c/0x1d30
        ext4_map_blocks+0x431/0xa50
        ext4_getblk+0x82/0x340
        ext4_bread+0x14/0x110
        ext4_quota_read+0xf0/0x180
        v2_read_header+0x24/0x90
        v2_check_quota_file+0x2f/0xa0
        dquot_load_quota_sb+0x26c/0x760
        dquot_load_quota_inode+0xa5/0x190
        ext4_enable_quotas+0x14c/0x300
        __ext4_fill_super+0x31cc/0x32c0
        ext4_fill_super+0x115/0x2d0
        get_tree_bdev+0x1d2/0x360
        ext4_get_tree+0x19/0x30
        vfs_get_tree+0x26/0xe0
        path_mount+0x81d/0xfc0
        do_mount+0x8d/0xc0
        __x64_sys_mount+0xc0/0x160
        do_syscall_64+0x35/0x80
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
        </TASK>
      ==================================================================
      
      Above issue may happen as follows:
      -------------------------------------
      ext4_fill_super
       ext4_orphan_cleanup
        ext4_enable_quotas
         ext4_quota_enable
          ext4_iget --> get error inode <5>
           ext4_ext_check_inode --> Wrong imode makes it escape inspection
           make_bad_inode(inode) --> EXT4_BOOT_LOADER_INO set imode
          dquot_load_quota_inode
           vfs_setup_quota_inode --> check pass
           dquot_load_quota_sb
            v2_check_quota_file
             v2_read_header
              ext4_quota_read
               ext4_bread
                ext4_getblk
                 ext4_map_blocks
                  ext4_ext_map_blocks
                   ext4_find_extent
                    ext4_cache_extents
                     ext4_es_cache_extent
                      __es_tree_search.isra.0
                       ext4_es_end --> Wrong extents trigger BUG_ON
      
      In the above issue, s_usr_quota_inum is set to 5, but inode<5> contains
      an incorrect imode and disordered extents. Because 5 is EXT4_BOOT_LOADER_INO,
      the ext4_ext_check_inode check in the ext4_iget function can be bypassed.
      As a result, the unchecked extents trigger the BUG_ON in the
      __es_tree_search function. To solve this issue, check whether the inode is
      a bad inode in vfs_setup_quota_inode().
      
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20221026042310.3839669-2-libaokun1@huawei.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1d552483
    • Jan Kara's avatar
      quota: Factor out setup of quota inode · 3da22d06
      Jan Kara authored
      [ Upstream commit c7d3d283 ]

      Factor out setting up of quota inode and eventual error cleanup from
      vfs_load_quota_inode(). This will simplify the situation for filesystems
      that don't have any quota inodes.

      Signed-off-by: Jan Kara <jack@suse.cz>
      Stable-dep-of: d3238774 ("ext4: fix bug_on in __es_tree_search caused by bad quota inode")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3da22d06
    • Bixuan Cui's avatar
      jbd2: use the correct print format · ecb9d0d2
      Bixuan Cui authored
      [ Upstream commit d87a7b4c ]
      
      The print format error was found when using ftrace event:
          <...>-1406 [000] .... 23599442.895823: jbd2_end_commit: dev 252,8 transaction -1866216965 sync 0 head -1866217368
          <...>-1406 [000] .... 23599442.896299: jbd2_start_commit: dev 252,8 transaction -1866216964 sync 0
      
      Use the correct print format for transaction, head and tid.
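The negative transaction numbers in the trace above are what an unsigned 32-bit ID looks like when printed with a signed specifier. The following is a small user-space sketch of that effect, not the jbd2 tracepoint code; the helper names are hypothetical, and the signed conversion assumes the usual two's-complement behavior of common compilers.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: prints the ID with a signed specifier (the bug). */
int format_tid_signed(char *buf, size_t len, unsigned int tid)
{
	/* Values above INT_MAX wrap to negative numbers on typical platforms. */
	return snprintf(buf, len, "transaction %d", (int)tid);
}

/* Hypothetical helper: prints the same ID with an unsigned specifier. */
int format_tid_unsigned(char *buf, size_t len, unsigned int tid)
{
	return snprintf(buf, len, "transaction %u", tid);
}
```

With a 32-bit `unsigned int`, the ID 2428750331 comes out as -1866216965 from the first helper (matching the first trace line above) and as 2428750331 from the second.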
      
      Fixes: 879c5e6b ('jbd2: convert instrumentation from markers to tracepoints')
      Signed-off-by: Bixuan Cui <cuibixuan@linux.alibaba.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Link: https://lore.kernel.org/r/1665488024-95172-1-git-send-email-cuibixuan@linux.alibaba.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ecb9d0d2
    • Ferry Toth's avatar
      usb: ulpi: defer ulpi_register on ulpi_read_id timeout · 06bb3f4e
      Ferry Toth authored
      [ Upstream commit 8a7b31d5 ]
      
      Since commit 0f010171 ("usb: dwc3: Don't switch OTG -> peripheral
      if extcon is present") Dual Role support on Intel Merrifield platform
      broke due to rearranging the call to dwc3_get_extcon().
      
      It appears to be caused by ulpi_read_id() on the first test write failing
      with -ETIMEDOUT. Currently ulpi_read_id() expects to discover the phy via
      DT when the test write fails and returns 0 in that case, even if DT does not
      provide the phy. As a result usb probe completes without phy.
      
      Make ulpi_read_id() return -ETIMEDOUT to its user if the first test write
      fails. The user should then handle it appropriately. A follow up patch
      will make dwc3_core_init() set -EPROBE_DEFER in this case and bail out.
      
      Fixes: ef6a7bcf ("usb: ulpi: Support device discovery via DT")
      Cc: stable@vger.kernel.org
      Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Signed-off-by: Ferry Toth <ftoth@exalondelft.nl>
      Link: https://lore.kernel.org/r/20221205201527.13525-2-ftoth@exalondelft.nl
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      06bb3f4e
    • Michael Walle's avatar
      wifi: wilc1000: sdio: fix module autoloading · a2689a44
      Michael Walle authored
      [ Upstream commit 57d545b5 ]

      There are no SDIO module aliases included in the driver, therefore,
      module autoloading isn't working. Add the proper MODULE_DEVICE_TABLE().

      Cc: stable@vger.kernel.org
      Signed-off-by: Michael Walle <michael@walle.cc>
      Signed-off-by: Kalle Valo <kvalo@kernel.org>
      Link: https://lore.kernel.org/r/20221027171221.491937-1-michael@walle.cc
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a2689a44
    • Herbert Xu's avatar
      ipv6: raw: Deduct extension header length in rawv6_push_pending_frames · 3998dba0
      Herbert Xu authored
      commit cb3e9864 upstream.
      
      The total cork length created by ip6_append_data includes extension
      headers, so we must exclude them when comparing them against the
      IPV6_CHECKSUM offset which does not include extension headers.
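The length adjustment can be sketched as a small bounds check. This is an illustration under stated assumptions rather than the kernel function: `checksum_offset_valid` and its parameters are hypothetical names, and the check assumes the 16-bit checksum field must fit entirely inside the payload once extension headers are excluded.

```c
/*
 * Hypothetical sketch: cork_len is the total corked length (which includes
 * extension headers), ext_hdr_len is the length of those headers, and
 * csum_offset is the IPV6_CHECKSUM offset, which does not count them.
 * Returns 1 when the 2-byte checksum field fits in the payload, else 0.
 */
int checksum_offset_valid(int cork_len, int ext_hdr_len, int csum_offset)
{
	int payload_len = cork_len - ext_hdr_len;	/* deduct extension headers */

	return csum_offset < payload_len - 1;
}
```

Comparing the offset against `cork_len` directly would accept offsets that point past the real payload whenever extension headers are present, which is the mismatch the commit describes.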
      
      Reported-by: Kyle Zeng <zengyhkyle@gmail.com>
      Fixes: 357b40a1 ("[IPV6]: IPV6_CHECKSUM socket option can corrupt kernel memory")
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3998dba0
    • Yang Yingliang's avatar
      ixgbe: fix pci device refcount leak · 53cefa80
      Yang Yingliang authored
      commit b93fb440 upstream.
      
      As the comment of pci_get_domain_bus_and_slot() says, it
      returns a PCI device with its refcount incremented. When the caller
      has finished using the device, it must decrement the reference count
      by calling pci_dev_put().

      Call pci_dev_put() in ixgbe_get_first_secondary_devfn() and
      ixgbe_x550em_a_has_mii() to avoid the leak.
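The get/put discipline described above can be modeled in plain C. This is a user-space sketch with hypothetical types and helpers (`fake_dev`, `fake_dev_get`, `fake_dev_put`, `read_devfn`); it does not use the PCI API and only shows that every successful lookup is balanced by a release.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted device. */
struct fake_dev {
	int refcount;
	int devfn;
};

/* Lookup helper: returns the device with its refcount incremented. */
struct fake_dev *fake_dev_get(struct fake_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

/* Release helper: drops the reference taken by fake_dev_get(). */
void fake_dev_put(struct fake_dev *d)
{
	if (d)
		d->refcount--;
}

/* Reads one field and drops the reference before returning. */
int read_devfn(struct fake_dev *bus_dev)
{
	struct fake_dev *d = fake_dev_get(bus_dev);
	int devfn;

	if (!d)
		return -1;
	devfn = d->devfn;	/* copy what is needed while the reference is held */
	fake_dev_put(d);	/* without this call the refcount would leak */
	return devfn;
}
```

The refcount is the same before and after `read_devfn()`, which is the invariant the fix restores for the two ixgbe helpers.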
      
      Fixes: 8fa10ef0 ("ixgbe: register a mdiobus")
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      53cefa80
    • Hans de Goede's avatar
      platform/x86: sony-laptop: Don't turn off 0x153 keyboard backlight during probe · e0d6f3b6
      Hans de Goede authored
      commit ad75bd85 upstream.
      
      The 0x153 version of the kbd backlight control SNC handle has no separate
      address to probe if the backlight is there.
      
      This turns the probe call into a set keyboard backlight call with a value
      of 0 turning off the keyboard backlight.
      
      Skip probing when there is no separate probe address to avoid this.
      
      Link: https://bugzilla.redhat.com/show_bug.cgi?id=1583752
      Fixes: 800f2017 ("Keyboard backlight control for some Vaio Fit models")
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Reviewed-by: Mattia Dongili <malattia@linux.it>
      Link: https://lore.kernel.org/r/20221213122943.11123-1-hdegoede@redhat.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0d6f3b6
    • Konrad Dybcio's avatar
      drm/msm/adreno: Make adreno quirks not overwrite each other · 1ad759df
      Konrad Dybcio authored
      commit 13ef096e upstream.
      
      So far the adreno quirks have all been assigned with an OR operator,
      which is problematic, because they were assigned consecutive integer
      values, which makes checking them with an AND operator kind of no bueno..
      
      Switch to using BIT(n) so that only the quirks that the programmer chose
      are taken into account when evaluating info->quirks & ADRENO_QUIRK_...
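The problem with consecutive integer values is easy to reproduce in isolation. In the sketch below, the quirk names and values are hypothetical placeholders (not the driver's real identifiers), and `BIT()` is redefined locally so the example builds outside the kernel.

```c
#include <stdbool.h>

#define BIT(n) (1UL << (n))

/* Old scheme: consecutive integers. Note that 3 == (1 | 2). */
enum old_quirk {
	OLD_QUIRK_A = 1,
	OLD_QUIRK_B = 2,
	OLD_QUIRK_C = 3,
};

/* New scheme: one distinct bit per quirk. */
#define NEW_QUIRK_A BIT(0)
#define NEW_QUIRK_B BIT(1)
#define NEW_QUIRK_C BIT(2)

/* Mirrors a check of the form "quirks & SOME_QUIRK". */
bool has_quirk(unsigned long quirks, unsigned long quirk)
{
	return (quirks & quirk) != 0;
}
```

Under the old scheme, a device that only has quirk C also appears to have quirks A and B, because its value 3 shares bits with 1 and 2. With one bit per quirk, `has_quirk(NEW_QUIRK_C, NEW_QUIRK_A)` is false as intended.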
      
      Fixes: 370063ee ("drm/msm/adreno: Add A540 support")
      Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
      Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org>
      Reviewed-by: Rob Clark <robdclark@gmail.com>
      Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
      Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
      Patchwork: https://patchwork.freedesktop.org/patch/516456/
      Link: https://lore.kernel.org/r/20230102100201.77286-1-konrad.dybcio@linaro.org
      Signed-off-by: Rob Clark <robdclark@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ad759df
    • Volker Lendecke's avatar
      cifs: Fix uninitialized memory read for smb311 posix symlink create · 098416c4
      Volker Lendecke authored
      commit a152d05a upstream.
      
      If smb311 posix is enabled, we send the intended mode for file
      creation in the posix create context. Instead of using what's there on
      the stack, create the mfsymlink file with 0644.
      
      Fixes: ce558b0e ("smb3: Add posix create context for smb3.11 posix mounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: Volker Lendecke <vl@samba.org>
      Reviewed-by: Tom Talpey <tom@talpey.com>
      Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      098416c4
    • Adrian Chan's avatar
      ALSA: hda/hdmi: Add a HP device 0x8715 to force connect list · d6546426
      Adrian Chan authored
      commit de1ccb9e upstream.

      Add the 'HP Engage Flex Mini' device to the force connect list to
      enable audio through HDMI.

      Signed-off-by: Adrian Chan <adchan@google.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20230109210520.16060-1-adchan@google.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d6546426
    • Clement Lecigne's avatar
      ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAF · eaa5580a
      Clement Lecigne authored
      [ Note: this is a fix that works around the bug equivalently as the
        two upstream commits:
         1fa4445f ("ALSA: control - introduce snd_ctl_notify_one() helper")
         56b88b50 ("ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAF")
        but in a simpler way to fit with older stable trees -- tiwai ]

      Add missing locking in ctl_elem_read_user/ctl_elem_write_user which can be
      easily triggered and turned into a use-after-free.
      
      Example code paths with SNDRV_CTL_IOCTL_ELEM_READ:
      
      64-bits:
      snd_ctl_ioctl
        snd_ctl_elem_read_user
          [takes controls_rwsem]
          snd_ctl_elem_read [lock properly held, all good]
          [drops controls_rwsem]
      
      32-bits (compat):
      snd_ctl_ioctl_compat
        snd_ctl_elem_write_read_compat
          ctl_elem_write_read
            snd_ctl_elem_read [missing lock, not good]
      
      CVE-2023-0266 was assigned for this issue.
      
      Signed-off-by: Clement Lecigne <clecigne@google.com>
      Cc: stable@kernel.org # 5.12 and older
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Reviewed-by: Jaroslav Kysela <perex@perex.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eaa5580a
    • Paolo Abeni's avatar
      net/ulp: prevent ULP without clone op from entering the LISTEN status · c6d29a5f
      Paolo Abeni authored
      commit 2c02d41d upstream.
      
      When an ULP-enabled socket enters the LISTEN status, the listener ULP data
      pointer is copied inside the child/accepted sockets by sk_clone_lock().
      
      The relevant ULP can take care of de-duplicating the context pointer via
      the clone() operation, but only MPTCP and SMC implement such op.
      
      Other ULPs may end-up with a double-free at socket disposal time.
      
      We can't simply clear the ULP data at clone time, as TLS replaces the
      socket ops with custom ones assuming a valid TLS ULP context is
      available.
      
      Instead completely prevent clone-less ULP sockets from entering the
      LISTEN status.
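The gate described above amounts to refusing listen() when a ULP is attached but provides no clone operation. The sketch below is a simplified model with hypothetical types and names (`ulp_ops_sketch`, `ulp_can_listen_sketch`), not the kernel's data structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced view of a ULP's operations table. */
struct ulp_ops_sketch {
	const char *name;
	void (*clone)(void);	/* NULL when the ULP has no clone op */
};

static void dummy_clone(void)
{
}

/* Two sample ULPs: one implementing clone, one without it. */
const struct ulp_ops_sketch ulp_with_clone = { "with-clone", dummy_clone };
const struct ulp_ops_sketch ulp_without_clone = { "without-clone", NULL };

/* Returns 0 if the socket may enter LISTEN, -EINVAL otherwise. */
int ulp_can_listen_sketch(const struct ulp_ops_sketch *ops)
{
	if (ops && !ops->clone)
		return -EINVAL;	/* clone-less ULP: child sockets would share the context */
	return 0;		/* no ULP at all, or a ULP that can de-duplicate */
}
```

Sockets without any ULP are unaffected, since a NULL `ops` pointer passes the check.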
      
      Fixes: 734942cc ("tcp: ULP infrastructure")
      Reported-by: slipper <slipper.alive@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Link: https://lore.kernel.org/r/4b80c3d1dbe3d0ab072f80450c202d9bc88b4b03.1672740602.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c6d29a5f
    • Heiko Carstens's avatar
      s390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple() · b318d41f
      Heiko Carstens authored
      commit e3f360db upstream.

      Make sure that *ptr__ within arch_this_cpu_to_op_simple() is only
      dereferenced once by using READ_ONCE(). Otherwise the compiler could
      generate incorrect code.

      Cc: <stable@vger.kernel.org>
      Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b318d41f
    • Alexander Egorenkov's avatar
      s390/kexec: fix ipl report address for kdump · f6da927c
      Alexander Egorenkov authored
      commit c2337a40 upstream.
      
      This commit addresses the following erroneous situation with file-based
      kdump executed on a system with a valid IPL report.
      
      On s390, a kdump kernel, its initrd and IPL report if present are loaded
      into a special and reserved on boot memory region - crashkernel. When
      a system crashes and kdump was activated before, the purgatory code
      is entered first which swaps the crashkernel and [0 - crashkernel size]
      memory regions. Only after that the kdump kernel is entered. For this
      reason, the pointer to an IPL report in lowcore must point to the IPL report
      after the swap and not to the address of the IPL report that was located in
      crashkernel memory region before the swap. Failing to do so makes the
      kdump's decompressor try to read memory from the crashkernel memory region
      which already contains the production kernel's memory.
      
      The situation described above caused spontaneous kdump failures/hangs
      on systems where the Secure IPL is activated because on such systems
      an IPL report is always present. In that case kdump's decompressor tried
      to parse an IPL report which frequently led to illegal memory accesses
      because an IPL report contains addresses to various data.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 99feaa71 ("s390/kexec_file: Create ipl report and pass to next kernel")
      Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f6da927c
    • Adrian Hunter's avatar
      perf auxtrace: Fix address filter duplicate symbol selection · 4bf6e11c
      Adrian Hunter authored
      commit cf129830 upstream.
      
      When a match has been made to the nth duplicate symbol, return
      success not error.
      
      Example:
      
        Before:
      
          $ cat file.c
          cat: file.c: No such file or directory
          $ cat file1.c
          #include <stdio.h>
      
          static void func(void)
          {
                  printf("First func\n");
          }
      
          void other(void);
      
          int main()
          {
                  func();
                  other();
                  return 0;
          }
          $ cat file2.c
          #include <stdio.h>
      
          static void func(void)
          {
                  printf("Second func\n");
          }
      
          void other(void)
          {
                  func();
          }
      
          $ gcc -Wall -Wextra -o test file1.c file2.c
          $ perf record -e intel_pt//u --filter 'filter func @ ./test' -- ./test
          Multiple symbols with name 'func'
          #1      0x1149  l       func
                          which is near           main
          #2      0x1179  l       func
                          which is near           other
          Disambiguate symbol name by inserting #n after the name e.g. func #2
          Or select a global symbol by inserting #0 or #g or #G
          Failed to parse address filter: 'filter func @ ./test'
          Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
          Where multiple filters are separated by space or comma.
          $ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test
          Failed to parse address filter: 'filter func #2 @ ./test'
          Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
          Where multiple filters are separated by space or comma.
      
        After:
      
          $ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test
          First func
          Second func
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.016 MB perf.data ]
          $ perf script --itrace=b -Ftime,flags,ip,sym,addr --ns
          1231062.526977619:   tr strt                               0 [unknown] =>     558495708179 func
          1231062.526977619:   tr end  call               558495708188 func =>     558495708050 _init
          1231062.526979286:   tr strt                               0 [unknown] =>     55849570818d func
          1231062.526979286:   tr end  return             55849570818f func =>     55849570819d other
      
      Fixes: 1b36c03e ("perf record: Add support for using symbols in address filters")
      Reported-by: Dmitrii Dolgov <9erthalion6@gmail.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Dmitry Dolgov <9erthalion6@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230110185659.15979-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4bf6e11c
    • Jonathan Corbet's avatar
      docs: Fix the docs build with Sphinx 6.0 · 2e4164d3
      Jonathan Corbet authored
      commit 0283189e upstream.
      
      Sphinx 6.0 removed the execfile_() function, which we use as part of the
      configuration process.  They *did* warn us...  Just open-code the
      functionality as is done in Sphinx itself.
      
      Tested (using SPHINX_CONF, since this code is only executed with an
      alternative config file) on various Sphinx versions from 2.5 through 6.0.
      
      Reported-by: Martin Liška <mliska@suse.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e4164d3
    • Ard Biesheuvel's avatar
      efi: tpm: Avoid READ_ONCE() for accessing the event log · 3ed18307
      Ard Biesheuvel authored
      commit d3f45053 upstream.
      
      Nathan reports that recent kernels built with LTO will crash when doing
      EFI boot using Fedora's GRUB and SHIM. The culprit turns out to be a
      misaligned load from the TPM event log, which is annotated with
      READ_ONCE(), and under LTO, this gets translated into an LDAR instruction
      which does not tolerate misaligned accesses.
      
      Interestingly, this does not happen when booting the same kernel
      straight from the UEFI shell, and so the fact that the event log may
      appear misaligned in memory may be caused by a bug in GRUB or SHIM.
      
      However, using READ_ONCE() to access firmware tables is slightly unusual
      in any case, and here, we only need to ensure that 'event' is not
      dereferenced again after it gets unmapped, but this is already taken
      care of by the implicit barrier() semantics of the early_memunmap()
      call.
      
      Cc: <stable@vger.kernel.org>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Jarkko Sakkinen <jarkko@kernel.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Reported-by: Nathan Chancellor <nathan@kernel.org>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Link: https://github.com/ClangBuiltLinux/linux/issues/1782
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ed18307
    • Marc Zyngier's avatar
      KVM: arm64: Fix S1PTW handling on RO memslots · 3ad31129
      Marc Zyngier authored
      commit 406504c7 upstream.
      
      A recent development on the EFI front has resulted in guests having
      their page tables baked in the firmware binary, and mapped into the
      IPA space as part of a read-only memslot. Not only is this legitimate,
      but it also results in added security, so thumbs up.
      
      It is possible to take an S1PTW translation fault if the S1 PTs are
      unmapped at stage-2. However, KVM unconditionally treats S1PTW as a
      write to correctly handle hardware AF/DB updates to the S1 PTs.
      Furthermore, KVM injects an exception into the guest for S1PTW writes.
      In the aforementioned case this results in the guest taking an abort
      it won't recover from, as the S1 PTs mapping the vectors suffer from
      the same problem.
      
      So clearly our handling is... wrong.
      
      Instead, switch to a two-pronged approach:
      
      - On S1PTW translation fault, handle the fault as a read
      
      - On S1PTW permission fault, handle the fault as a write
      
      This is of no consequence to SW that *writes* to its PTs (the write
      will trigger a non-S1PTW fault), and SW that uses RO PTs will not
      use HW-assisted AF/DB anyway, as that'd be wrong.
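The two-pronged rule above can be written as a small decision function. This is a sketch with hypothetical types and field names (`abort_info_sketch`, `treat_as_write`), not KVM's code; the handling of non-S1PTW faults (instruction aborts are never writes, data aborts follow the write flag) is an assumption added to make the example complete.

```c
#include <stdbool.h>

enum fault_kind_sketch {
	FAULT_TRANSLATION,
	FAULT_PERMISSION,
};

/* Hypothetical, reduced description of a stage-2 abort. */
struct abort_info_sketch {
	bool s1ptw;	/* fault taken during a stage-1 page table walk */
	bool is_iabt;	/* instruction abort rather than data abort */
	bool wnr;	/* write-not-read flag for data aborts */
	enum fault_kind_sketch kind;
};

/* Decide whether the fault should be handled as a write. */
bool treat_as_write(const struct abort_info_sketch *a)
{
	if (a->s1ptw)
		/* S1PTW: translation fault -> read, permission fault -> write */
		return a->kind == FAULT_PERMISSION;
	if (a->is_iabt)
		return false;	/* assumed: instruction fetches are reads */
	return a->wnr;		/* assumed: data aborts follow the write flag */
}
```

With this rule, an S1PTW translation fault on a read-only memslot is handled as a read, so the guest's page tables can be mapped without injecting an abort.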
      
      Only in the case described in c4ad98e4 ("KVM: arm64: Assume write
      fault on S1PTW permission fault on instruction fetch") do we end-up
      with two back-to-back faults (page being evicted and faulted back).
      I don't think this is a case worth optimising for.
      
      Fixes: c4ad98e4 ("KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch")
      Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Regression-tested-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ad31129
    • Frederick Lawler's avatar
      net: sched: disallow noqueue for qdisc classes · 9b83ec63
      Frederick Lawler authored
      commit 96398560 upstream.
      
      While experimenting with applying noqueue to a classful queue discipline,
      we discovered a NULL pointer dereference in the __dev_queue_xmit()
      path that generates a kernel OOPS:
      
          # dev=enp0s5
          # tc qdisc replace dev $dev root handle 1: htb default 1
          # tc class add dev $dev parent 1: classid 1:1 htb rate 10mbit
          # tc qdisc add dev $dev parent 1:1 handle 10: noqueue
          # ping -I $dev -w 1 -c 1 1.1.1.1
      
      [    2.172856] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [    2.173217] #PF: supervisor instruction fetch in kernel mode
      ...
      [    2.178451] Call Trace:
      [    2.178577]  <TASK>
      [    2.178686]  htb_enqueue+0x1c8/0x370
      [    2.178880]  dev_qdisc_enqueue+0x15/0x90
      [    2.179093]  __dev_queue_xmit+0x798/0xd00
      [    2.179305]  ? _raw_write_lock_bh+0xe/0x30
      [    2.179522]  ? __local_bh_enable_ip+0x32/0x70
      [    2.179759]  ? ___neigh_create+0x610/0x840
      [    2.179968]  ? eth_header+0x21/0xc0
      [    2.180144]  ip_finish_output2+0x15e/0x4f0
      [    2.180348]  ? dst_output+0x30/0x30
      [    2.180525]  ip_push_pending_frames+0x9d/0xb0
      [    2.180739]  raw_sendmsg+0x601/0xcb0
      [    2.180916]  ? _raw_spin_trylock+0xe/0x50
      [    2.181112]  ? _raw_spin_unlock_irqrestore+0x16/0x30
      [    2.181354]  ? get_page_from_freelist+0xcd6/0xdf0
      [    2.181594]  ? sock_sendmsg+0x56/0x60
      [    2.181781]  sock_sendmsg+0x56/0x60
      [    2.181958]  __sys_sendto+0xf7/0x160
      [    2.182139]  ? handle_mm_fault+0x6e/0x1d0
      [    2.182366]  ? do_user_addr_fault+0x1e1/0x660
      [    2.182627]  __x64_sys_sendto+0x1b/0x30
      [    2.182881]  do_syscall_64+0x38/0x90
      [    2.183085]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      ...
      [    2.187402]  </TASK>
      
      Previously in commit d66d6c31 ("net: sched: register noqueue
      qdisc"), NULL was set for the noqueue discipline on noqueue init
      so that __dev_queue_xmit() falls through for the noqueue case. This
      also sets a bypass of the enqueue NULL check in the
      register_qdisc() function for the struct noqueue_disc_ops.
      
      Classful queue disciplines make it past the NULL check in
      __dev_queue_xmit() because the discipline is set to htb (in this case),
      and then in the call to __dev_xmit_skb(), it calls into htb_enqueue()
      which grabs a leaf node for a class and then calls qdisc_enqueue() by
      passing in a queue discipline which assumes ->enqueue() is not set to NULL.
      
      Fix this by not allowing classes to be assigned to the noqueue
      discipline. Linux TC Notes states that classes cannot be set to
      the noqueue discipline. [1] Let's enforce that here.
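The restriction can be sketched as a check made when a qdisc is attached to a class. The types and names below (`qdisc_ops_sketch`, `graft_to_class_sketch`) are hypothetical and much simpler than the kernel's; the sketch only shows a qdisc whose enqueue hook is NULL being rejected before a classful parent can call it.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced view of a qdisc's operations. */
struct qdisc_ops_sketch {
	const char *id;
	int (*enqueue)(void *skb);	/* NULL for the noqueue discipline */
};

static int dummy_enqueue(void *skb)
{
	(void)skb;
	return 0;
}

const struct qdisc_ops_sketch noqueue_ops_sketch = { "noqueue", NULL };
const struct qdisc_ops_sketch fifo_ops_sketch = { "fifo", dummy_enqueue };

/* Attach a child qdisc to a class; refuse noqueue. */
int graft_to_class_sketch(const struct qdisc_ops_sketch *new_ops)
{
	if (new_ops == &noqueue_ops_sketch)
		return -EINVAL;	/* a classful parent would call a NULL enqueue */
	return 0;
}
```

Rejecting the configuration up front means a parent such as htb never reaches a leaf whose enqueue pointer is NULL.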
      
      Links:
      1. https://linux-tc-notes.sourceforge.net/tc/doc/sch_noqueue.txt
      
      Fixes: d66d6c31 ("net: sched: register noqueue qdisc")
      Cc: stable@vger.kernel.org
      Signed-off-by: Frederick Lawler <fred@cloudflare.com>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/r/20230109163906.706000-1-fred@cloudflare.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b83ec63
    • Isaac J. Manjarres's avatar
      driver core: Fix bus_type.match() error handling in __driver_attach() · aa52acef
      Isaac J. Manjarres authored
      commit 27c0d217 upstream.
      
      When a driver registers with a bus, it will attempt to match with every
      device on the bus through the __driver_attach() function. Currently, if
      the bus_type.match() function encounters an error that is not
      -EPROBE_DEFER, __driver_attach() will return a negative error code, which
      causes the driver registration logic to stop trying to match with the
      remaining devices on the bus.
      
      This behavior is not correct; a failure while matching a driver to a
      device does not mean that the driver won't be able to match and bind
      with other devices on the bus. Update the logic in __driver_attach()
      to reflect this.
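The intended behavior is that a match error for one device is skipped rather than ending the whole walk. The sketch below models this with hypothetical names (`attach_all_sketch`, `sample_match`) and plain integers for devices; it leaves out the separate -EPROBE_DEFER handling mentioned above.

```c
#include <errno.h>

/* Match callback: >0 means match, 0 means no match, <0 means error. */
typedef int (*match_fn)(int dev_id);

/* Sample callback: device 2 fails to match with an error, odd IDs match. */
int sample_match(int dev_id)
{
	if (dev_id == 2)
		return -EINVAL;
	return dev_id % 2;
}

/*
 * Try to match the driver against every device. A failure on one device
 * does not stop the loop. Returns the number of devices bound and stores
 * their IDs in bound[].
 */
int attach_all_sketch(const int *devs, int n, match_fn match, int *bound)
{
	int count = 0;

	for (int i = 0; i < n; i++) {
		int ret = match(devs[i]);

		if (ret <= 0)
			continue;	/* no match or error: move on to the next device */
		bound[count++] = devs[i];
	}
	return count;
}
```

With devices 1 to 4 and `sample_match`, the error on device 2 does not prevent device 3 from being bound.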
      
      Fixes: 656b8035 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
      Cc: stable@vger.kernel.org
      Cc: Saravana Kannan <saravanak@google.com>
      Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
      Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa52acef
    • Muhammad Usama Anjum's avatar
      selftests: set the BUILD variable to absolute path · 8d60a905
      Muhammad Usama Anjum authored
      commit 5ad51ab6 upstream.
      
      The build of kselftests fails if a relative path is specified through
      KBUILD_OUTPUT or the O=<path> method. The BUILD variable is used to
      determine the path of the output objects. When make is run from other
      directories with relative paths, the exact path of the build objects is
      ambiguous and the build fails.
      
      	make[1]: Entering directory '/home/usama/repos/kernel/linux_mainline2/tools/testing/selftests/alsa'
      	gcc     mixer-test.c -L/usr/lib/x86_64-linux-gnu -lasound  -o build/kselftest/alsa/mixer-test
      	/usr/bin/ld: cannot open output file build/kselftest/alsa/mixer-test
      
      Set the BUILD variable to the absolute path of the output directory.
      Make the logic readable and easy to follow. Use spaces instead of tabs
      for indentation, because make treats tab-indented lines as recipe lines.
      
      Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: Tyler Hicks (Microsoft) <code@tyhicks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d60a905
    • Shuah Khan's avatar
      selftests: Fix kselftest O=objdir build from cluttering top level objdir · cad6d2bb
      Shuah Khan authored
      commit 29e911ef upstream.
      
      make kselftest-all O=objdir builds create generated objects in objdir.
      This clutters the top level directory with kselftest objects. Fix it
      to create a sub-directory under objdir for kselftest objects.
      
      Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: Tyler Hicks (Microsoft) <code@tyhicks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cad6d2bb
    • Helge Deller's avatar
      parisc: Align parisc MADV_XXX constants with all other architectures · 320dbbd8
      Helge Deller authored
      commit 71bdea6f upstream.
      
      Adjust some MADV_XXX constants so that they are in sync with their
      values on all other platforms. There is currently no reason for parisc
      to have its own numbering, and it requires workarounds in many userspace
      sources (e.g. glibc, qemu, ...), which are often forgotten and thus
      introduce bugs and different behaviour on parisc.
      
      A wrapper avoids an ABI breakage for existing userspace applications by
      translating any old values to the new ones, so this change allows us to
      move over all programs to the new ABI over time.
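A translation wrapper of the kind described above might look roughly like the following. This is a minimal sketch, not the parisc implementation: the function name `translate_madvise_behavior()` and the `SKETCH_` constants are hypothetical, only two constants are shown, and the legacy numbers (65 and 67) are assumptions chosen for illustration. The new values (12 and 14) are the generic MADV_MERGEABLE and MADV_HUGEPAGE values used on other architectures.

```c
/* Generic values shared by other architectures. */
#define SKETCH_MADV_MERGEABLE 12
#define SKETCH_MADV_HUGEPAGE  14

/* Assumed legacy arch-specific values; illustrative only. */
#define SKETCH_OLD_MADV_MERGEABLE 65
#define SKETCH_OLD_MADV_HUGEPAGE  67

/*
 * Map a legacy advice value to its new number so that binaries built
 * against the old ABI keep working. Any other value passes through
 * unchanged.
 */
static int translate_madvise_behavior(int behavior)
{
    switch (behavior) {
    case SKETCH_OLD_MADV_MERGEABLE:
        return SKETCH_MADV_MERGEABLE;
    case SKETCH_OLD_MADV_HUGEPAGE:
        return SKETCH_MADV_HUGEPAGE;
    default:
        return behavior;
    }
}
```

A syscall entry point would call such a helper first and then hand the translated value to the common madvise path, which is why old and new userspace can coexist during the transition.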
      
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      320dbbd8
    • Jan Kara's avatar
      mbcache: Avoid nesting of cache->c_list_lock under bit locks · d868597b
      Jan Kara authored
      commit 5fc4cbd9 upstream.
      
      Commit 307af6c8 ("mbcache: automatically delete entries from cache
      on freeing") started nesting cache->c_list_lock under the bit locks
      protecting hash buckets of the mbcache hash table in
      mb_cache_entry_create(). This causes problems for real-time kernels
      because there, spinlocks are sleeping locks while bit locks stay atomic.
      Luckily the nesting is easy to avoid by holding an entry reference until
      the entry is added to the LRU list. This makes sure we cannot race with
      entry deletion.
      
      Cc: stable@kernel.org
      Fixes: 307af6c8 ("mbcache: automatically delete entries from cache on freeing")
      Reported-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220908091032.10513-1-jack@suse.cz
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d868597b