  1. Jan 18, 2023
    • arm64: cmpxchg_double*: hazard against entire exchange variable · 94b6cf84
      Mark Rutland authored
      [ Upstream commit 031af500 ]
      
      The inline assembly for arm64's cmpxchg_double*() implementations uses a
      +Q constraint to hazard against other accesses to the memory location
      being exchanged. However, the pointer passed to the constraint is a
      pointer to unsigned long, and thus the hazard only applies to the first
      8 bytes of the location.
      
      GCC can take advantage of this, assuming that other portions of the
      location are unchanged, leading to a number of potential problems.
      
      This is similar to what we fixed back in commit:
      
        fee960be ("arm64: xchg: hazard against entire exchange variable")
      
      ... but we forgot to adjust cmpxchg_double*() similarly at the same
      time.
      
      The same problem applies, as demonstrated with the following test:
      
      | struct big {
      |         u64 lo, hi;
      | } __aligned(128);
      |
      | unsigned long foo(struct big *b)
      | {
      |         u64 hi_old, hi_new;
      |
      |         hi_old = b->hi;
      |         cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78);
      |         hi_new = b->hi;
      |
      |         return hi_old ^ hi_new;
      | }
      
      ... which GCC 12.1.0 compiles as:
      
      | 0000000000000000 <foo>:
      |    0:   d503233f        paciasp
      |    4:   aa0003e4        mov     x4, x0
      |    8:   1400000e        b       40 <foo+0x40>
      |    c:   d2800240        mov     x0, #0x12                       // #18
      |   10:   d2800681        mov     x1, #0x34                       // #52
      |   14:   aa0003e5        mov     x5, x0
      |   18:   aa0103e6        mov     x6, x1
      |   1c:   d2800ac2        mov     x2, #0x56                       // #86
      |   20:   d2800f03        mov     x3, #0x78                       // #120
      |   24:   48207c82        casp    x0, x1, x2, x3, [x4]
      |   28:   ca050000        eor     x0, x0, x5
      |   2c:   ca060021        eor     x1, x1, x6
      |   30:   aa010000        orr     x0, x0, x1
      |   34:   d2800000        mov     x0, #0x0                        // #0    <--- BANG
      |   38:   d50323bf        autiasp
      |   3c:   d65f03c0        ret
      |   40:   d2800240        mov     x0, #0x12                       // #18
      |   44:   d2800681        mov     x1, #0x34                       // #52
      |   48:   d2800ac2        mov     x2, #0x56                       // #86
      |   4c:   d2800f03        mov     x3, #0x78                       // #120
      |   50:   f9800091        prfm    pstl1strm, [x4]
      |   54:   c87f1885        ldxp    x5, x6, [x4]
      |   58:   ca0000a5        eor     x5, x5, x0
      |   5c:   ca0100c6        eor     x6, x6, x1
      |   60:   aa0600a6        orr     x6, x5, x6
      |   64:   b5000066        cbnz    x6, 70 <foo+0x70>
      |   68:   c8250c82        stxp    w5, x2, x3, [x4]
      |   6c:   35ffff45        cbnz    w5, 54 <foo+0x54>
      |   70:   d2800000        mov     x0, #0x0                        // #0     <--- BANG
      |   74:   d50323bf        autiasp
      |   78:   d65f03c0        ret
      
      Notice that at the lines with "BANG" comments, GCC has assumed that the
      higher 8 bytes are unchanged by the cmpxchg_double() call, and that
      `hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and
      LL/SC versions of cmpxchg_double().
      
      This patch fixes the issue by passing a pointer to __uint128_t into the
      +Q constraint, ensuring that the compiler hazards against the entire 16
      bytes being modified.
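
      The effect of widening the operand can be sketched in portable C. This is
      only an illustration, not the kernel code: it assumes a 64-bit GCC or
      Clang target that provides __uint128_t, uses the generic "+m" constraint
      as a stand-in for arm64's "+Q", and leaves the asm body empty. The helper
      names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct big {
	uint64_t lo, hi;
} __attribute__((aligned(16)));

/* The 16-byte operand below must fit inside the structure. */
static_assert(sizeof(struct big) >= sizeof(__uint128_t), "operand too wide");

/* Hypothetical helper: the operand is 8 bytes wide, so only b->lo is hazarded. */
static inline void hazard_lo_only(struct big *b)
{
	__asm__ __volatile__("" : "+m" (*(uint64_t *)&b->lo));
}

/* Hypothetical helper: the operand is 16 bytes wide, covering b->lo and b->hi. */
static inline void hazard_whole(struct big *b)
{
	__asm__ __volatile__("" : "+m" (*(__uint128_t *)&b->lo));
}

/* The compiler is free to assume b->hi is unchanged across the asm. */
uint64_t foo_narrow(struct big *b)
{
	uint64_t hi_old = b->hi;

	hazard_lo_only(b);
	return hi_old ^ b->hi;
}

/* The compiler has to reload b->hi after the asm. */
uint64_t foo_wide(struct big *b)
{
	uint64_t hi_old = b->hi;

	hazard_whole(b);
	return hi_old ^ b->hi;
}
```

      Because the asm body is empty, both functions return 0 at run time; the
      difference only shows up in the generated code, where foo_narrow() may be
      folded to a constant.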
      
      With this change, GCC 12.1.0 compiles the above test as:
      
      | 0000000000000000 <foo>:
      |    0:   f9400407        ldr     x7, [x0, #8]
      |    4:   d503233f        paciasp
      |    8:   aa0003e4        mov     x4, x0
      |    c:   1400000f        b       48 <foo+0x48>
      |   10:   d2800240        mov     x0, #0x12                       // #18
      |   14:   d2800681        mov     x1, #0x34                       // #52
      |   18:   aa0003e5        mov     x5, x0
      |   1c:   aa0103e6        mov     x6, x1
      |   20:   d2800ac2        mov     x2, #0x56                       // #86
      |   24:   d2800f03        mov     x3, #0x78                       // #120
      |   28:   48207c82        casp    x0, x1, x2, x3, [x4]
      |   2c:   ca050000        eor     x0, x0, x5
      |   30:   ca060021        eor     x1, x1, x6
      |   34:   aa010000        orr     x0, x0, x1
      |   38:   f9400480        ldr     x0, [x4, #8]
      |   3c:   d50323bf        autiasp
      |   40:   ca0000e0        eor     x0, x7, x0
      |   44:   d65f03c0        ret
      |   48:   d2800240        mov     x0, #0x12                       // #18
      |   4c:   d2800681        mov     x1, #0x34                       // #52
      |   50:   d2800ac2        mov     x2, #0x56                       // #86
      |   54:   d2800f03        mov     x3, #0x78                       // #120
      |   58:   f9800091        prfm    pstl1strm, [x4]
      |   5c:   c87f1885        ldxp    x5, x6, [x4]
      |   60:   ca0000a5        eor     x5, x5, x0
      |   64:   ca0100c6        eor     x6, x6, x1
      |   68:   aa0600a6        orr     x6, x5, x6
      |   6c:   b5000066        cbnz    x6, 78 <foo+0x78>
      |   70:   c8250c82        stxp    w5, x2, x3, [x4]
      |   74:   35ffff45        cbnz    w5, 5c <foo+0x5c>
      |   78:   f9400480        ldr     x0, [x4, #8]
      |   7c:   d50323bf        autiasp
      |   80:   ca0000e0        eor     x0, x7, x0
      |   84:   d65f03c0        ret
      
      ... sampling the high 8 bytes before and after the cmpxchg, and
      performing an EOR, as we'd expect.
      
      For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note
      that linux-4.9.y is the oldest currently supported stable release, and
      mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run
      on my machines due to library incompatibilities.
      
      I've also used a standalone test to check that we can use a __uint128_t
      pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM
      3.9.1.
      
      Fixes: 5284e1b4 ("arm64: xchg: Implement cmpxchg_double")
      Fixes: e9a4b795 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
      Reported-by: Boqun Feng <boqun.feng@gmail.com>
      Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/lkml/Y6GXoO4qmH9OIZ5Q@hirez.programming.kicks-ass.net/
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: stable@vger.kernel.org
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Steve Capper <steve.capper@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20230104151626.3262137-1-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: atomics: remove LL/SC trampolines · 3891fa49
      Mark Rutland authored
      [ Upstream commit b2c3ccbd ]
      
      When CONFIG_ARM64_LSE_ATOMICS=y, each use of an LL/SC atomic results in
      a fragment of code being generated in a subsection without a clear
      association with its caller. A trampoline in the caller branches to the
      LL/SC atomic with a direct branch, and the atomic directly branches
      back into its trampoline.
      
      This breaks backtracing, as any PC within the out-of-line fragment will
      be symbolized as an offset from the nearest prior symbol (which may not
      be the function using the atomic), and since the atomic returns with a
      direct branch, the caller's PC may be missing from the backtrace.
      
      For example, with secondary_start_kernel() hacked to contain
      atomic_inc(NULL), the resulting exception can be reported as being taken
      from cpus_are_stuck_in_kernel():
      
      | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      | Mem abort info:
      |   ESR = 0x0000000096000004
      |   EC = 0x25: DABT (current EL), IL = 32 bits
      |   SET = 0, FnV = 0
      |   EA = 0, S1PTW = 0
      |   FSC = 0x04: level 0 translation fault
      | Data abort info:
      |   ISV = 0, ISS = 0x00000004
      |   CM = 0, WnR = 0
      | [0000000000000000] user address but active_mm is swapper
      | Internal error: Oops: 96000004 [#1] PREEMPT SMP
      | Modules linked in:
      | CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.19.0-11219-geb555cb5b794-dirty #3
      | Hardware name: linux,dummy-virt (DT)
      | pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      | pc : cpus_are_stuck_in_kernel+0xa4/0x120
      | lr : secondary_start_kernel+0x164/0x170
      | sp : ffff80000a4cbe90
      | x29: ffff80000a4cbe90 x28: 0000000000000000 x27: 0000000000000000
      | x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
      | x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000
      | x20: 0000000000000001 x19: 0000000000000001 x18: 0000000000000008
      | x17: 3030383832343030 x16: 3030303030307830 x15: ffff80000a4cbab0
      | x14: 0000000000000001 x13: 5d31666130663133 x12: 3478305b20313030
      | x11: 3030303030303078 x10: 3020726f73736563 x9 : 726f737365636f72
      | x8 : ffff800009ff2ef0 x7 : 0000000000000003 x6 : 0000000000000000
      | x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000100
      | x2 : 0000000000000000 x1 : ffff0000029bd880 x0 : 0000000000000000
      | Call trace:
      |  cpus_are_stuck_in_kernel+0xa4/0x120
      |  __secondary_switched+0xb0/0xb4
      | Code: 35ffffa3 17fffc6c d53cd040 f9800011 (885f7c01)
      | ---[ end trace 0000000000000000 ]---
      
      This is confusing and hinders debugging, and will be problematic for
      CONFIG_LIVEPATCH as these cases cannot be unwound reliably.
      
      This is very similar to recent issues with out-of-line exception fixups,
      which were removed in commits:
      
        35d67794 ("arm64: lib: __arch_clear_user(): fold fixups into body")
        4012e0e2 ("arm64: lib: __arch_copy_from_user(): fold fixups into body")
        139f9ab7 ("arm64: lib: __arch_copy_to_user(): fold fixups into body")
      
      When the trampolines were introduced in commit:

        addfc386 ("arm64: atomics: avoid out-of-line ll/sc atomics")

      ... the rationale was to improve icache performance by grouping the LL/SC
      atomics together. This has never been measured, and this theoretical
      benefit is outweighed by other factors:
      
      * As the subsections are collapsed into sections at object file
        granularity, these are spread out throughout the kernel and can share
        cachelines with unrelated code regardless.
      
      * GCC 12.1.0 has been observed to place the trampoline out-of-line in
        specialised __ll_sc_*() functions, introducing more branching than was
        intended.
      
      * Removing the trampolines has been observed to shrink a defconfig
        kernel Image by 64KiB when building with GCC 12.1.0.
      
      This patch removes the LL/SC trampolines, meaning that the LL/SC atomics
      will be inlined into their callers (or placed in out-of-line functions
      using regular BL/RET pairs). When CONFIG_ARM64_LSE_ATOMICS=y, the LL/SC
      atomics are always called in an unlikely branch, and will be placed in a
      cold portion of the function, so this should have minimal impact to the
      hot paths.
      
      Other than the improved backtracing, there should be no functional
      change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20220817155914.3975112-2-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Stable-dep-of: 031af500 ("arm64: cmpxchg_double*: hazard against entire exchange variable")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: atomics: format whitespace consistently · 61e86339
      Mark Rutland authored
      [ Upstream commit 8e6082e9 ]
      
      The code for the atomic ops is formatted inconsistently, and while this
      is not a functional problem it is rather distracting when working on
      them.
      
      Some ops have consistent indentation, e.g.
      
      | #define ATOMIC_OP_ADD_RETURN(name, mb, cl...)                           \
      | static inline int __lse_atomic_add_return##name(int i, atomic_t *v)     \
      | {                                                                       \
      |         u32 tmp;                                                        \
      |                                                                         \
      |         asm volatile(                                                   \
      |         __LSE_PREAMBLE                                                  \
      |         "       ldadd" #mb "    %w[i], %w[tmp], %[v]\n"                 \
      |         "       add     %w[i], %w[i], %w[tmp]"                          \
      |         : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp)        \
      |         : "r" (v)                                                       \
      |         : cl);                                                          \
      |                                                                         \
      |         return i;                                                       \
      | }
      
      While others have negative indentation for some lines, and/or have
      misaligned trailing backslashes, e.g.
      
      | static inline void __lse_atomic_##op(int i, atomic_t *v)                        \
      | {                                                                       \
      |         asm volatile(                                                   \
      |         __LSE_PREAMBLE                                                  \
      | "       " #asm_op "     %w[i], %[v]\n"                                  \
      |         : [i] "+r" (i), [v] "+Q" (v->counter)                           \
      |         : "r" (v));                                                     \
      | }
      
      This patch makes the indentation consistent and also aligns the trailing
      backslashes. This makes the code easier to read for those (like myself)
      who are easily distracted by these inconsistencies.
      
      This is intended as a cleanup.
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Acked-by: Will Deacon <will@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20211210151410.2782645-2-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Stable-dep-of: 031af500 ("arm64: cmpxchg_double*: hazard against entire exchange variable")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • io_uring: lock overflowing for IOPOLL · ed4629d1
      Pavel Begunkov authored
      commit 544d163d upstream.
      
      syzbot reports an issue with overflow filling for IOPOLL:
      
      WARNING: CPU: 0 PID: 28 at io_uring/io_uring.c:734 io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734
      CPU: 0 PID: 28 Comm: kworker/u4:1 Not tainted 6.2.0-rc3-syzkaller-16369-g358a161a6a9e #0
      Workqueue: events_unbound io_ring_exit_work
      Call trace:
       io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734
       io_req_cqe_overflow+0x5c/0x70 io_uring/io_uring.c:773
       io_fill_cqe_req io_uring/io_uring.h:168 [inline]
       io_do_iopoll+0x474/0x62c io_uring/rw.c:1065
       io_iopoll_try_reap_events+0x6c/0x108 io_uring/io_uring.c:1513
       io_uring_try_cancel_requests+0x13c/0x258 io_uring/io_uring.c:3056
       io_ring_exit_work+0xec/0x390 io_uring/io_uring.c:2869
       process_one_work+0x2d8/0x504 kernel/workqueue.c:2289
       worker_thread+0x340/0x610 kernel/workqueue.c:2436
       kthread+0x12c/0x158 kernel/kthread.c:376
       ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:863
      
      There is no real problem for normal IOPOLL as flush is also called with
      uring_lock taken, but it's getting more complicated for IOPOLL|SQPOLL,
      for which __io_cqring_overflow_flush() happens from the CQ waiting path.
      
      Reported-and-tested-by: <syzbot+6805087452d72929404e@syzkaller.appspotmail.com>
      Cc: stable@vger.kernel.org # 5.10+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID · fbf50151
      Paolo Bonzini authored
      [ Upstream commit 45e966fc ]
      
      Passing the host topology to the guest is almost certainly wrong
      and will confuse the scheduler.  In addition, several fields of
      these CPUID leaves vary on each processor; it is simply impossible to
      return the right values from KVM_GET_SUPPORTED_CPUID in such a way that
      they can be passed to KVM_SET_CPUID2.
      
      The values that will most likely prevent confusion are all zeroes.
      Userspace will have to override it anyway if it wishes to present a
      specific topology to the guest.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Documentation: KVM: add API issues section · ee168411
      Paolo Bonzini authored
      [ Upstream commit cde363ab ]
      
      Add a section to document all the different ways in which the KVM API sucks.
      
      I am sure there are way more; give people a place to vent so that userspace
      authors are aware.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <20220322110712.222449-4-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mm: Always release pages to the buddy allocator in memblock_free_late(). · b8f3b3cf
      Aaron Thompson authored
      [ Upstream commit 115d9d77 ]
      
      If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages()
      only releases pages to the buddy allocator if they are not in the
      deferred range. This is correct for free pages (as defined by
      for_each_free_mem_pfn_range_in_zone()) because free pages in the
      deferred range will be initialized and released as part of the deferred
      init process. memblock_free_pages() is called by memblock_free_late(),
      which is used to free reserved ranges after memblock_free_all() has
      run. All pages in reserved ranges have been initialized at that point,
      and accordingly, those pages are not touched by the deferred init
      process. This means that currently, if the pages that
      memblock_free_late() intends to release are in the deferred range, they
      will never be released to the buddy allocator. They will forever be
      reserved.
      
      In addition, memblock_free_pages() calls kmsan_memblock_free_pages(),
      which is also correct for free pages but is not correct for reserved
      pages. KMSAN metadata for reserved pages is initialized by
      kmsan_init_shadow(), which runs shortly before memblock_free_all().
      
      For both of these reasons, memblock_free_pages() should only be called
      for free pages, and memblock_free_late() should call __free_pages_core()
      directly instead.
      
      One case where this issue can occur in the wild is EFI boot on
      x86_64. The x86 EFI code reserves all EFI boot services memory ranges
      via memblock_reserve() and frees them later via memblock_free_late()
      (efi_reserve_boot_services() and efi_free_boot_services(),
      respectively). If any of those ranges happens to fall within the
      deferred init range, the pages will not be released and that memory will
      be unavailable.
      
      For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI:
      
      v6.2-rc2:
        # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
        Node 0, zone      DMA
                spanned  4095
                present  3999
                managed  3840
        Node 0, zone    DMA32
                spanned  246652
                present  245868
                managed  178867
      
      v6.2-rc2 + patch:
        # grep -E 'Node|spanned|present|managed' /proc/zoneinfo
        Node 0, zone      DMA
                spanned  4095
                present  3999
                managed  3840
        Node 0, zone    DMA32
                spanned  246652
                present  245868
                managed  222816   # +43,949 pages
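
      The "+43,949 pages" annotation can be checked from the two DMA32
      "managed" counts above. A small sketch of the arithmetic follows; the
      4 KiB page size is an assumption (the usual x86_64 base page size), and
      the helper names are made up for illustration.

```c
#include <assert.h>

/* DMA32 "managed" page counts copied from the /proc/zoneinfo output above. */
enum { MANAGED_BEFORE = 178867, MANAGED_AFTER = 222816 };

/* ASSUMPTION: 4 KiB base pages. */
enum { PAGE_KIB = 4 };

static_assert(MANAGED_AFTER > MANAGED_BEFORE, "patch should not lose pages");

/* Pages returned to the buddy allocator by the patch. */
static long recovered_pages(void)
{
	return MANAGED_AFTER - MANAGED_BEFORE;
}

/* The same amount expressed in KiB. */
static long recovered_kib(void)
{
	return recovered_pages() * PAGE_KIB;
}
```

      Under that page-size assumption, the recovered memory is roughly 171 MiB
      on a 1 GB instance.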
      
      Fixes: 3a80a7fa ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
      Signed-off-by: Aaron Thompson <dev@aaront.org>
      Link: https://lore.kernel.org/r/01010185892de53e-e379acfb-7044-4b24-b30a-e2657c1ba989-000000@us-west-2.amazonses.com
      Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • platform/surface: aggregator: Add missing call to ssam_request_sync_free() · d2dc110d
      Maximilian Luz authored
      [ Upstream commit c965daac ]
      
      Although rare, ssam_request_sync_init() can fail. In that case, the
      request should be freed via ssam_request_sync_free(). Currently it is
      leaked instead. Fix this.
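
      The general shape of such a fix can be sketched as follows. This is not
      the driver code: request_alloc(), request_init(), request_free() and
      submit() are hypothetical stand-ins, and the live_requests counter exists
      only to make the leak observable.

```c
#include <assert.h>
#include <stdlib.h>

struct request {
	int flags;
};

/* Number of requests currently allocated; only used to observe leaks. */
static int live_requests;

static struct request *request_alloc(void)
{
	struct request *rqst = calloc(1, sizeof(*rqst));

	if (rqst)
		live_requests++;
	return rqst;
}

static void request_free(struct request *rqst)
{
	if (rqst) {
		live_requests--;
		free(rqst);
	}
}

/* Hypothetical init step that can fail: rejects negative flags. */
static int request_init(struct request *rqst, int flags)
{
	if (flags < 0)
		return -1;
	rqst->flags = flags;
	return 0;
}

int submit(int flags)
{
	struct request *rqst = request_alloc();
	int status;

	if (!rqst)
		return -1;

	status = request_init(rqst, flags);
	if (status) {
		request_free(rqst);	/* without this call the request leaks */
		return status;
	}

	/* ... a real caller would submit the request and wait here ... */
	request_free(rqst);
	return 0;
}
```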
      
      Fixes: c167b9c7 ("platform/surface: Add Surface Aggregator subsystem")
      Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
      Link: https://lore.kernel.org/r/20221220175608.1436273-1-luzmaximilian@gmail.com
      Reviewed-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • igc: Fix PPS delta between two synchronized end-points · cfd59784
      Christopher S Hall authored
      [ Upstream commit 5e91c72e ]
      
      This patch fixes the pulse per second (PPS) output delta between
      two synchronized end-points.

      Based on Intel Discrete I225 Software User Manual Section
      4.2.15 TimeSync Auxiliary Control Register, ST0[Bit 4] and
      ST1[Bit 7] must be set to ensure that the clock output is
      toggled based on the defined frequency value. This ensures
      that the output of the PPS is aligned with the clock.
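
      As a rough sketch of what setting those two bits looks like, using only
      the bit positions quoted above: the macro and function names below are
      hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as quoted from the manual; the names are hypothetical. */
#define EXAMPLE_TSAUXC_ST0	(UINT32_C(1) << 4)
#define EXAMPLE_TSAUXC_ST1	(UINT32_C(1) << 7)

static_assert((EXAMPLE_TSAUXC_ST0 | EXAMPLE_TSAUXC_ST1) == 0x90, "bit layout");

/* Return the register value with both start-time bits set. */
static uint32_t tsauxc_set_start_bits(uint32_t tsauxc)
{
	return tsauxc | EXAMPLE_TSAUXC_ST0 | EXAMPLE_TSAUXC_ST1;
}
```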
      
      How to test:
      
      1) Run time synchronization on both end-points.
      Ex: ptp4l --step_threshold=1 -m -f gPTP.cfg -i <interface name>
      
      2) Configure PPS output using the commands below on both end-points.
      Ex: SDP0 on I225 REV4 SKU variant
      
      ./testptp -d /dev/ptp0 -L 0,2
      ./testptp -d /dev/ptp0 -p 1000000000
      
      3) Measure the output using an analyzer on both end-points.
      
      Fixes: 87938851 ("igc: enable auxiliary PHC functions for the i225")
      Signed-off-by: Christopher S Hall <christopher.s.hall@intel.com>
      Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
      Acked-by: Sasha Neftin <sasha.neftin@intel.com>
      Tested-by: Naama Meir <naamax.meir@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf build: Properly guard libbpf includes · 0bf52601
      Ian Rogers authored
      [ Upstream commit d891f2b7 ]
      
      Including libbpf header files should be guarded by HAVE_LIBBPF_SUPPORT.
      In bpf_counter.h, move the skeleton utilities under HAVE_BPF_SKEL.
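
      The guarding pattern is roughly the following. This is a minimal sketch
      rather than the perf sources: bpf_counters_available() is a hypothetical
      helper, and only the HAVE_LIBBPF_SUPPORT macro name comes from the text
      above.

```c
#include <assert.h>

#ifdef HAVE_LIBBPF_SUPPORT
/* Only pulled in when the build system detected libbpf. */
#include <bpf/libbpf.h>
#endif

/* Hypothetical helper: reports whether libbpf-backed code was compiled in. */
static int bpf_counters_available(void)
{
#ifdef HAVE_LIBBPF_SUPPORT
	return 1;
#else
	return 0;
#endif
}
```

      Built without -DHAVE_LIBBPF_SUPPORT, the translation unit never references
      the libbpf header, so it compiles on systems where libbpf is absent.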
      
      Fixes: d6a735ef ("perf bpf_counter: Move common functions to bpf_counter.h")
      Reported-by: Mike Leach <mike.leach@linaro.org>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Mike Leach <mike.leach@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20230105172243.7238-1-mike.leach@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/mlx5e: Don't support encap rules with gbp option · 205f35ee
      Gavin Li authored
      [ Upstream commit d515d63c ]
      
      Previously, encap rules with the gbp option were offloaded by mistake,
      even though the driver does not support gbp option offload.
      
      To fix this issue, check whether the encap rule has the gbp option and,
      if so, don't offload the rule.
      
      Fixes: d8f9dfae ("net: sched: allow flower to match vxlan options")
      Signed-off-by: Gavin Li <gavinl@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/mlx5: Fix ptp max frequency adjustment range · 0526fc93
      Rahul Rameshbabu authored
      [ Upstream commit fe91d572 ]
      
      .max_adj of ptp_clock_info acts as an absolute value for the amount in ppb
      that can be set for a single call of .adjfine. This means that the value
      passed to a single call of .adjfine cannot be greater than .max_adj or
      less than -(.max_adj). Provide the correct max frequency adjustment value
      supported by devices.
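
      The symmetric range described above can be sketched as a simple check.
      The struct and function here are hypothetical illustrations, not the PTP
      core or the mlx5 driver, and the choice of -ERANGE as the error code is
      an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, cut-down clock description: max_adj is in ppb. */
struct example_clock_info {
	long max_adj;
};

/* Accept a requested adjustment only if it lies in [-max_adj, +max_adj]. */
static int example_check_adj(const struct example_clock_info *info, long ppb)
{
	if (ppb > info->max_adj || ppb < -info->max_adj)
		return -ERANGE;
	return 0;
}
```

      If max_adj is advertised larger than the hardware supports, requests the
      device cannot honour pass this kind of check, which is why the advertised
      value has to match the device.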
      
      Fixes: 3d8c38af ("net/mlx5e: Add PTP Hardware Clock (PHC) support")
      Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/sched: act_mpls: Fix warning during failed attribute validation · 9e2c3882
      Ido Schimmel authored
      [ Upstream commit 9e17f992 ]
      
      The 'TCA_MPLS_LABEL' attribute is of 'NLA_U32' type, but has a
      validation type of 'NLA_VALIDATE_FUNCTION'. This is an invalid
      combination according to the comment above 'struct nla_policy':
      
      "
      Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN:
         NLA_BINARY           Validation function called for the attribute.
         All other            Unused - but note that it's a union
      "
      
      This can trigger the warning [1] in nla_get_range_unsigned() when
      validation of the attribute fails. Despite being of 'NLA_U32' type, the
      associated 'min'/'max' fields in the policy are negative as they are
      aliased by the 'validate' field.
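
      The aliasing can be illustrated with a simplified union. This is a sketch
      only: struct example_policy is a hypothetical, cut-down stand-in for the
      real policy structure and keeps just the two members discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified policy: range and callback share storage. */
struct example_policy {
	union {
		struct {
			int16_t min, max;
		};
		int (*validate)(const void *attr);
	};
};

/* Both views start at the same offset inside the structure. */
static_assert(offsetof(struct example_policy, min) ==
	      offsetof(struct example_policy, validate), "members alias");

static int always_ok(const void *attr)
{
	(void)attr;
	return 0;
}

/* Returns 1 if min/max live in the bytes that hold the function pointer. */
static int range_aliases_validate(void)
{
	struct example_policy p = { .validate = always_ok };

	return (void *)&p.min == (void *)&p.validate;
}
```

      Once validate is set, min and max read back as whatever bits the pointer
      happens to contain, so code that interprets them as a range sees
      arbitrary, possibly negative, values.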
      
      Fix by changing the attribute type to 'NLA_BINARY' which is consistent
      with the above comment and all other users of NLA_POLICY_VALIDATE_FN().
      As a result, move the length validation to the validation function.
      
      No regressions in MPLS tests:
      
       # ./tdc.py -f tc-tests/actions/mpls.json
       [...]
       # echo $?
       0
      
      [1]
      WARNING: CPU: 0 PID: 17743 at lib/nlattr.c:118
      nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117
      Modules linked in:
      CPU: 0 PID: 17743 Comm: syz-executor.0 Not tainted 6.1.0-rc8 #3
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
      RIP: 0010:nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117
      [...]
      Call Trace:
       <TASK>
       __netlink_policy_dump_write_attr+0x23d/0x990 net/netlink/policy.c:310
       netlink_policy_dump_write_attr+0x22/0x30 net/netlink/policy.c:411
       netlink_ack_tlv_fill net/netlink/af_netlink.c:2454 [inline]
       netlink_ack+0x546/0x760 net/netlink/af_netlink.c:2506
       netlink_rcv_skb+0x1b7/0x240 net/netlink/af_netlink.c:2546
       rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6109
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:714 [inline]
       sock_sendmsg net/socket.c:734 [inline]
       ____sys_sendmsg+0x38f/0x500 net/socket.c:2482
       ___sys_sendmsg net/socket.c:2536 [inline]
       __sys_sendmsg+0x197/0x230 net/socket.c:2565
       __do_sys_sendmsg net/socket.c:2574 [inline]
       __se_sys_sendmsg net/socket.c:2572 [inline]
       __x64_sys_sendmsg+0x42/0x50 net/socket.c:2572
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Link: https://lore.kernel.org/netdev/CAO4mrfdmjvRUNbDyP0R03_DrD_eFCLCguz6OxZ2TYRSv0K9gxA@mail.gmail.com/
      Fixes: 2a2ea508 ("net: sched: add mpls manipulation actions to TC")
      Reported-by: Wei Chen <harperchen1110@gmail.com>
      Tested-by: Wei Chen <harperchen1110@gmail.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Link: https://lore.kernel.org/r/20230107171004.608436-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc: fix the O_* fcntl/open macro definitions for riscv · e3bb44be
      Willy Tarreau authored
      [ Upstream commit 00b18da4 ]
      
      When the RISCV port was imported in 5.2, the O_* macros were taken with
      their octal values and written as-is in hex, causing getdents64() to
      fail in nolibc-test.
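
A minimal C sketch of this class of mistake, assuming the asm-generic value of O_DIRECTORY (0200000 in octal). The `DEMO_*` names and the helper are hypothetical and are not nolibc's own definitions; they only show why copying octal digits into a hex literal yields a different flag bit.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only. */
#define DEMO_O_DIRECTORY_OCTAL 0200000   /* octal value, as in asm-generic */
#define DEMO_O_DIRECTORY_WRONG 0x200000  /* same digits written as-is in hex */
#define DEMO_O_DIRECTORY_RIGHT 0x10000   /* octal 0200000 converted to hex */

/* Returns true when a candidate value equals the octal definition. */
static bool demo_flag_matches(unsigned int candidate)
{
	return candidate == DEMO_O_DIRECTORY_OCTAL;
}
```

With these values, `demo_flag_matches(DEMO_O_DIRECTORY_RIGHT)` holds while `demo_flag_matches(DEMO_O_DIRECTORY_WRONG)` does not, so an open() using the wrong constant would not request a directory at all.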
      
      Fixes: 582e84f7 ("tool headers nolibc: add RISCV support") #5.2
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc: restore mips branch ordering in the _start block · 1e6ec75b
      Willy Tarreau authored
      [ Upstream commit 184177c3 ]
      
      Depending on the compiler used and the optimization options, the sbrk()
      test was crashing, both on real hardware (mips-24kc) and in qemu. One
      such example is kernel.org toolchain in version 11.3 optimizing at -Os.
      
      Inspecting the sys_brk() call shows the following code:
      
        0040047c <sys_brk>:
          40047c:       24020fcd        li      v0,4045
          400480:       27bdffe0        addiu   sp,sp,-32
          400484:       0000000c        syscall
          400488:       27bd0020        addiu   sp,sp,32
          40048c:       10e00001        beqz    a3,400494 <sys_brk+0x18>
          400490:       00021023        negu    v0,v0
          400494:       03e00008        jr      ra
      
      It is obviously wrong: the "negu" instruction is placed in beqz's
      delay slot, and worse, there's no nop nor instruction after the
      return, so the next function's first instruction (addiu sp,sp,-32)
      will also be executed as part of the delay slot that follows the
      return.
      
      This is caused by the ".set noreorder" directive in the _start block,
      which applies to the whole program. The compiler emits code without the
      delay slots and relies on the assembler to swap instructions when this
      option is not set. Removing the option would require changing the
      startup code in a way that wouldn't make it look like the resulting
      code, which would not be easy to debug. Instead let's just save the
      default ordering before changing it, and restore it at the end of the
      _start block. Now the code is correct:
      
        0040047c <sys_brk>:
          40047c:       24020fcd        li      v0,4045
          400480:       27bdffe0        addiu   sp,sp,-32
          400484:       0000000c        syscall
          400488:       10e00002        beqz    a3,400494 <sys_brk+0x18>
          40048c:       27bd0020        addiu   sp,sp,32
          400490:       00021023        negu    v0,v0
          400494:       03e00008        jr      ra
          400498:       00000000        nop
      
      Fixes: 66b6f755 ("rcutorture: Import a copy of nolibc") #5.0
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc: Remove .global _start from the entry point code · bd0431a6
      Ammar Faizi authored
      [ Upstream commit 1590c598 ]
      
      Building with clang yields the following error:
      ```
        <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
        .global _start
        ^
        1 error generated.
      ```
      Make sure to specify only one of `.global _start` and `.weak _start`:
      remove `.global _start`.
      
      Cc: llvm@lists.linux.dev
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Acked-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Stable-dep-of: 184177c3 ("tools/nolibc: restore mips branch ordering in the _start block")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc/arch: mark the _start symbol as weak · a77c54f5
      Willy Tarreau authored
      [ Upstream commit dffeb81a ]
      
      By doing so we can link together multiple C files that have been compiled
      with nolibc and which each have a _start symbol.
      
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Stable-dep-of: 184177c3 ("tools/nolibc: restore mips branch ordering in the _start block")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc/arch: split arch-specific code into individual files · da51e086
      Willy Tarreau authored
      [ Upstream commit 271661c1 ]
      
      In order to ease maintenance, this splits the arch-specific code into
      one file per architecture. A common file "arch.h" is used to include the
      right file among arch-* based on the detected architecture. Projects
      which are already split per architecture could simply rename these
      files to $arch/arch.h and get rid of the common arch.h. For this
      reason, include guards were placed into each arch-specific file.
      
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Stable-dep-of: 184177c3 ("tools/nolibc: restore mips branch ordering in the _start block")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc/types: split syscall-specific definitions into their own files · 8591e788
      Willy Tarreau authored
      [ Upstream commit cc7a492a ]
      
      The macros and type definitions used by a number of syscalls were moved
      to types.h where they will be easier to maintain. A few of them
      are arch-specific and must not be moved there (e.g. O_*, sys_stat_struct).
      A warning about them was placed at the top of the file.
      
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Stable-dep-of: 184177c3 ("tools/nolibc: restore mips branch ordering in the _start block")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc/std: move the standard type definitions to std.h · 4fceecde
      Willy Tarreau authored
      [ Upstream commit 967cce19 ]
      
      The ordering of includes and definitions for now is a bit of a mess, as
      for example asm/signal.h is included after int definitions, but plenty of
      structures are defined later as they rely on other includes.
      
      Let's move the standard type definitions to a dedicated file that is
      included first. We also move NULL there. This way all other includes
      are aware of it, and we can bring asm/signal.h back to the top of the
      file.
      
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Stable-dep-of: 184177c3 ("tools/nolibc: restore mips branch ordering in the _start block")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc: use pselect6 on RISCV · 1792136f
      Willy Tarreau authored
      [ Upstream commit 9c2970fb ]
      
      This arch doesn't provide the old-style select() syscall, so we have to
      use pselect6().
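
A minimal sketch of the idea, assuming a hosted libc: it layers a select()-style helper on the standard pselect() function rather than on the raw pselect6 syscall that nolibc issues. The name `demo_select` is hypothetical. The only real work is converting the `struct timeval` timeout (microseconds) into the `struct timespec` (nanoseconds) that pselect() expects.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: emulate select() on top of pselect(). */
static int demo_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds,
		       struct timeval *timeout)
{
	struct timespec ts;

	if (timeout) {
		/* timeval carries microseconds, timespec carries nanoseconds. */
		ts.tv_sec  = timeout->tv_sec;
		ts.tv_nsec = timeout->tv_usec * 1000;
	}
	/* A NULL sigmask leaves the signal mask untouched, as select() does. */
	return pselect(nfds, rfds, wfds, efds, timeout ? &ts : NULL, NULL);
}
```

Calling `demo_select(0, NULL, NULL, NULL, &tv)` with a zeroed `tv` returns immediately with 0, since no descriptors are watched.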
      
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Stable-dep-of: 184177c3 ("tools/nolibc: restore mips branch ordering in the _start block")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc: x86-64: Use `mov $60,%eax` instead of `mov $60,%rax` · 487386a4
      Ammar Faizi authored
      [ Upstream commit 7bdc0e7a ]
      
      Note that a mov to a 32-bit register zero-extends into the full 64-bit
      register. Thus `mov $60,%eax` has the same effect as `mov $60,%rax`.
      Use the shorter opcode to achieve the same thing.
      ```
        b8 3c 00 00 00       	mov    $60,%eax (5 bytes) [1]
        48 c7 c0 3c 00 00 00 	mov    $60,%rax (7 bytes) [2]
      ```
      Currently, we use [2]. Change it to [1] for shorter code.
      
      Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Stable-dep-of: 184177c3 ("tools/nolibc: restore mips branch ordering in the _start block")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools/nolibc: x86: Remove `r8`, `r9` and `r10` from the clobber list · 27af4f22
      Ammar Faizi authored
      [ Upstream commit bf916669 ]
      
      Linux x86-64 syscall only clobbers rax, rcx and r11 (and "memory").
      
        - rax for the return value.
        - rcx to save the return address.
        - r11 to save the rflags.
      
      Other registers are preserved.
      
      Having r8, r9 and r10 in the syscall clobber list is harmless, but this
      results in a missed-optimization.
      
      As the syscall doesn't clobber r8-r10, GCC should be allowed to reuse
      their value after the syscall returns to userspace. But since they are
      in the clobber list, GCC will always miss this opportunity.
      
      Remove them from the x86-64 syscall clobber list to help GCC generate
      better code and fix the comment.
      
      See also the x86-64 ABI, section A.2 AMD64 Linux Kernel Conventions,
      A.2.1 Calling Conventions [1].
      
      Extra note:
      Some people may think there is no real benefit in removing r8, r9 and
      r10 from the syscall clobber list, because a syscall looks like a C
      function call, and a function call always clobbers those 3.
      
      However, that is not the case for nolibc.h, because the "syscall"
      instruction (whose opcode is "0f 05") can be inlined into the user
      functions.
      
      All syscalls in nolibc.h are written as static functions with inline
      ASM and are likely to always be inlined when an optimization flag is
      used, so it is beneficial not to have r8, r9 and r10 in the clobber list.
      
      Here is the example where this matters.
      
      Consider the following C code:
      ```
        #include "tools/include/nolibc/nolibc.h"
        #define read_abc(a, b, c) __asm__ volatile("nop"::"r"(a),"r"(b),"r"(c))
      
        int main(void)
        {
        	int a = 0xaa;
        	int b = 0xbb;
        	int c = 0xcc;
      
        	read_abc(a, b, c);
        	write(1, "test\n", 5);
        	read_abc(a, b, c);
      
        	return 0;
        }
      ```
      
      Compile with:
          gcc -Os test.c -o test -nostdlib
      
      With r8, r9, r10 in the clobber list, GCC generates this:
      
      0000000000001000 <main>:
          1000:	f3 0f 1e fa          	endbr64
          1004:	41 54                	push   %r12
          1006:	41 bc cc 00 00 00    	mov    $0xcc,%r12d
          100c:	55                   	push   %rbp
          100d:	bd bb 00 00 00       	mov    $0xbb,%ebp
          1012:	53                   	push   %rbx
          1013:	bb aa 00 00 00       	mov    $0xaa,%ebx
          1018:	90                   	nop
          1019:	b8 01 00 00 00       	mov    $0x1,%eax
          101e:	bf 01 00 00 00       	mov    $0x1,%edi
          1023:	ba 05 00 00 00       	mov    $0x5,%edx
          1028:	48 8d 35 d1 0f 00 00 	lea    0xfd1(%rip),%rsi
          102f:	0f 05                	syscall
          1031:	90                   	nop
          1032:	31 c0                	xor    %eax,%eax
          1034:	5b                   	pop    %rbx
          1035:	5d                   	pop    %rbp
          1036:	41 5c                	pop    %r12
          1038:	c3                   	ret
      
      GCC assumes that syscall will clobber r8, r9 and r10, so it keeps 0xaa,
      0xbb and 0xcc in callee-saved registers (rbx, rbp and r12), which must
      be pushed and popped around the function body. This clearly costs
      extra memory accesses and extra stack space.
      
      But syscall does not actually clobber them, so this is a missed
      optimization.
      
      Now without r8, r9, r10 in the clobber list, GCC generates better code:
      
      0000000000001000 <main>:
          1000:	f3 0f 1e fa          	endbr64
          1004:	41 b8 aa 00 00 00    	mov    $0xaa,%r8d
          100a:	41 b9 bb 00 00 00    	mov    $0xbb,%r9d
          1010:	41 ba cc 00 00 00    	mov    $0xcc,%r10d
          1016:	90                   	nop
          1017:	b8 01 00 00 00       	mov    $0x1,%eax
          101c:	bf 01 00 00 00       	mov    $0x1,%edi
          1021:	ba 05 00 00 00       	mov    $0x5,%edx
          1026:	48 8d 35 d3 0f 00 00 	lea    0xfd3(%rip),%rsi
          102d:	0f 05                	syscall
          102f:	90                   	nop
          1030:	31 c0                	xor    %eax,%eax
          1032:	c3                   	ret
      
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: x86@kernel.org
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Ammar Faizi <ammar.faizi@students.amikom.ac.id>
      Link: https://gitlab.com/x86-psABIs/x86-64-ABI/-/wikis/x86-64-psABI [1]
      Link: https://lore.kernel.org/lkml/20211011040344.437264-1-ammar.faizi@students.amikom.ac.id/
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Stable-dep-of: 184177c3 ("tools/nolibc: restore mips branch ordering in the _start block")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • af_unix: selftest: Fix the size of the parameter to connect() · a60b2419
      Mirsad Goran Todorovac authored
      [ Upstream commit 7d6ceeb1 ]
      
      Adjust size parameter in connect() to match the type of the parameter, to
      fix "No such file or directory" error in selftests/net/af_unix/
      test_oob_unix.c:127.
      
      The existing code happens to work provided that the autogenerated pathname
      is shorter than sizeof (struct sockaddr), which is why it hasn't been
      noticed earlier.
      
      Visible from the trace excerpt:
      
      bind(3, {sa_family=AF_UNIX, sun_path="unix_oob_453059"}, 110) = 0
      clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa6a6577a10) = 453060
      [pid <child>] connect(6, {sa_family=AF_UNIX, sun_path="unix_oob_45305"}, 16) = -1 ENOENT (No such file or directory)
      
      BUG: The filename is trimmed to sizeof (struct sockaddr).
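
The truncation in the trace can be reproduced arithmetically. The sketch below assumes the Linux/glibc layouts of `struct sockaddr` and `struct sockaddr_un`; the helper name `demo_visible_path_bytes` is hypothetical. It computes how many bytes of `sun_path` fall within a given address length.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: number of sun_path bytes covered when an AF_UNIX
 * address is passed to bind()/connect() with the given addrlen.
 */
static size_t demo_visible_path_bytes(size_t addrlen)
{
	size_t header = offsetof(struct sockaddr_un, sun_path);

	return addrlen > header ? addrlen - header : 0;
}
```

On Linux, `demo_visible_path_bytes(sizeof(struct sockaddr))` is 14, which is why the 15-character name "unix_oob_453059" arrives as "unix_oob_45305"; passing `sizeof(struct sockaddr_un)` covers the full 108-byte path field.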
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
      Cc: Florian Westphal <fw@strlen.de>
      Reviewed-by: Florian Westphal <fw@strlen.de>
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nfc: pn533: Wait for out_urb's completion in pn533_usb_send_frame() · 39ae73e5
      Minsuk Kang authored
      [ Upstream commit 9dab880d ]
      
      Fix a use-after-free that occurs in hcd when in_urb sent from
      pn533_usb_send_frame() is completed earlier than out_urb. Its callback
      frees the skb data in pn533_send_async_complete() that is used as a
      transfer buffer of out_urb. Wait before sending in_urb until the
      callback of out_urb is called. To modify the callback of out_urb alone,
      separate the complete function of out_urb and ack_urb.
      
      Found by a modified version of syzkaller.
      
      BUG: KASAN: use-after-free in dummy_timer
      Call Trace:
       memcpy (mm/kasan/shadow.c:65)
       dummy_perform_transfer (drivers/usb/gadget/udc/dummy_hcd.c:1352)
       transfer (drivers/usb/gadget/udc/dummy_hcd.c:1453)
       dummy_timer (drivers/usb/gadget/udc/dummy_hcd.c:1972)
       arch_static_branch (arch/x86/include/asm/jump_label.h:27)
       static_key_false (include/linux/jump_label.h:207)
       timer_expire_exit (include/trace/events/timer.h:127)
       call_timer_fn (kernel/time/timer.c:1475)
       expire_timers (kernel/time/timer.c:1519)
       __run_timers (kernel/time/timer.c:1790)
       run_timer_softirq (kernel/time/timer.c:1803)
      
      Fixes: c46ee386 ("NFC: pn533: add NXP pn533 nfc device driver")
      Signed-off-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • hvc/xen: lock console list traversal · f6003784
      Roger Pau Monne authored
      [ Upstream commit c0dccad8 ]
      
      The currently lockless access to the xen console list in
      vtermno_to_xencons() is incorrect, as additions and removals from the
      list can happen anytime, and as such the traversal of the list to get
      the private console data for a given termno needs to happen with the
      lock held.  Note that users that modify the list already do so with
      the lock taken.
      
      Adjust current lock takers to use the _irq{save,restore} helpers,
      since the context in which vtermno_to_xencons() is called can have
      interrupts disabled.  Use the _irq{save,restore} set of helpers to
      switch the current callers to disable interrupts in the locked region.
      I haven't checked if existing users could instead use the _irq
      variant, as I think it's safer to use _irq{save,restore} upfront.
      
      While there, switch from using list_for_each_entry_safe to
      list_for_each_entry: the current entry cursor won't be removed as
      part of the code in the loop body, so using the _safe variant is
      pointless.
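
A userspace sketch of the locking pattern, under loud assumptions: a C11 `atomic_flag` spinlock stands in for the kernel spinlock, a hand-rolled singly linked list stands in for the kernel list, and all `demo_*` names are hypothetical. Userspace has no interrupts to disable, so the `_irq{save,restore}` part is not modelled; only the "lookup holds the same lock as the writers" part is shown.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the console list and its lock. */
struct demo_console {
	int vtermno;
	struct demo_console *next;
};

static struct demo_console *demo_list;
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void demo_lock_acquire(void)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire))
		;
}

static void demo_lock_release(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* Writers take the lock when modifying the list. */
static void demo_console_add(struct demo_console *c)
{
	demo_lock_acquire();
	c->next = demo_list;
	demo_list = c;
	demo_lock_release();
}

/* Readers take it too; the loop never unlinks, so a plain walk suffices. */
static struct demo_console *demo_vtermno_to_console(int vtermno)
{
	struct demo_console *c, *found = NULL;

	demo_lock_acquire();
	for (c = demo_list; c; c = c->next) {
		if (c->vtermno == vtermno) {
			found = c;
			break;
		}
	}
	demo_lock_release();
	return found;
}
```

Because the loop body only reads the cursor and never removes it, there is no need to cache the next pointer before each iteration, which is all the `_safe` iterator variant adds.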
      
      Fixes: 02e19f9c ('hvc_xen: implement multiconsole support')
      Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
      Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
      Link: https://lore.kernel.org/r/20221130163611.14686-1-roger.pau@citrix.com
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • octeontx2-af: Fix LMAC config in cgx_lmac_rx_tx_enable · 79c58b74
      Angela Czubak authored
      [ Upstream commit b4e9b876 ]
      
      PF netdev can request AF to enable or disable reception and transmission
      on the assigned CGX::LMAC. Instead of only enabling or disabling
      reception and transmission, the current code also enables/disables the
      LMAC itself. This patch fixes this issue.
      
      Fixes: 1435f66a ("octeontx2-af: CGX Rx/Tx enable/disable mbox handlers")
      Signed-off-by: Angela Czubak <aczubak@marvell.com>
      Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Link: https://lore.kernel.org/r/20230105160107.17638-1-hkelam@marvell.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tipc: fix unexpected link reset due to discovery messages · 303d0628
      Tung Nguyen authored
      [ Upstream commit c244c092 ]
      
      This unexpected behavior is observed:
      
      node 1                    | node 2
      ------                    | ------
      link is established       | link is established
      reboot                    | link is reset
      up                        | send discovery message
      receive discovery message |
      link is established       | link is established
      send discovery message    |
                                | receive discovery message
                                | link is reset (unexpected)
                                | send reset message
      link is reset             |
      
      It is due to delayed re-discovery as described in function
      tipc_node_check_dest(): "this link endpoint has already reset
      and re-established contact with the peer, before receiving a
      discovery message from that node."
      
      However, commit 598411d7 changed the condition for calling
      tipc_node_link_down(), which used to be the acceptance of a new media
      address.
      
      This commit fixes this by restoring the old and correct behavior.
      
      Fixes: 598411d7 ("tipc: make resetting of links non-atomic")
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ALSA: usb-audio: Relax hw constraints for implicit fb sync · e79d0f97
      Takashi Iwai authored
      [ Upstream commit d463ac1a ]
      
      Commit e4ea77f8 ("ALSA: usb-audio: Always apply the hw constraints
      for implicit fb sync") tried to address the bug where an incorrect
      PCM parameter is chosen when two (implicit fb) streams are set up at
      the same time.  This change had, however, a side effect: once the
      sync endpoint is chosen and set up, this restriction is applied at
      the next hw params unless it's freed explicitly via hw free.
      
      This patch is a workaround for the problem by relaxing the hw
      constraints a bit for the implicit fb sync.  We still keep applying
      the hw constraints for implicit fb sync, but only when the matching
      sync EP is being used by other streams.
      
      Fixes: e4ea77f8 ("ALSA: usb-audio: Always apply the hw constraints for implicit fb sync")
      Reported-by: Ruud van Asseldonk <ruud@veniogames.com>
      Link: https://lore.kernel.org/r/4e509aea-e563-e592-e652-ba44af6733fe@veniogames.com
      Link: https://lore.kernel.org/r/20230102170759.29610-3-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ALSA: usb-audio: Make sure to stop endpoints before closing EPs · c9557906
      Takashi Iwai authored
      [ Upstream commit 0599313e ]
      
      At the PCM hw params, we may re-configure the endpoints and it's done
      by a temporary EP close followed by re-open.  A potential problem
      there is that the EP might be already running internally at the PCM
      prepare stage; it's seen typically in the playback stream with the
      implicit feedback sync.  As this stream start isn't tracked by the
      core PCM layer, we'd need to stop it explicitly, and that's the
      missing piece.
      
      This patch adds the stop_endpoints() call at snd_usb_hw_params() to
      assure the stream stop before closing the EPs.
      
      Fixes: bf6313a0 ("ALSA: usb-audio: Refactor endpoint management")
      Link: https://lore.kernel.org/r/4e509aea-e563-e592-e652-ba44af6733fe@veniogames.com
      Link: https://lore.kernel.org/r/20230102170759.29610-2-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ASoC: wm8904: fix wrong outputs volume after power reactivation · 83e75810
      Emanuele Ghidoli authored
      [ Upstream commit 472a6309 ]
      
      Restore volume after charge pump and PGA activation to ensure
      that volume settings are correctly applied when re-enabling codec
      from SND_SOC_BIAS_OFF state.
      The CLASS_W, CHARGE_PUMP and POWER_MANAGEMENT_2 register configurations
      affect how the volume registers are applied and must be configured first.
      
      Fixes: a91eb199 ("ASoC: Initial WM8904 CODEC driver")
      Link: https://lore.kernel.org/all/c7864c35-738c-a867-a6a6-ddf9f98df7e7@gmail.com/
      Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
      Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
      Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
      Link: https://lore.kernel.org/r/20221223080247.7258-1-francesco@dolcini.it
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery · 7c26d218
      Peter Wang authored
      [ Upstream commit 1a5665fc ]
      
      When SSU/enter hibern8 fails in the WLUN suspend flow, trigger the error
      handler and return busy to break the suspend.  Otherwise the consumer
      will get stuck in runtime suspend status.
      
      Fixes: b294ff3e ("scsi: ufs: core: Enable power management for wlun")
      Signed-off-by: Peter Wang <peter.wang@mediatek.com>
      Link: https://lore.kernel.org/r/20221208072520.26210-1-peter.wang@mediatek.com
      Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: ufs: Stop using the clock scaling lock in the error handler · 513fdf0b
      Bart Van Assche authored
      [ Upstream commit 5675c381 ]
      
      Instead of locking and unlocking the clock scaling lock, surround the
      command queueing code with an RCU reader lock and call synchronize_rcu().
      This patch prepares for removal of the clock scaling lock.
      
      Link: https://lore.kernel.org/r/20211203231950.193369-16-bvanassche@acm.org
      Tested-by: Bean Huo <beanhuo@micron.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Bean Huo <beanhuo@micron.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Stable-dep-of: 1a5665fc ("scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: mpi3mr: Refer CONFIG_SCSI_MPI3MR in Makefile · 13259b60
      Shin'ichiro Kawasaki authored
      [ Upstream commit f0a43ba6 ]
      
      When the Kconfig item CONFIG_SCSI_MPI3MR was introduced for the mpi3mr
      driver, the Makefile of the driver was not modified to refer to the
      Kconfig item.
      
      As a result, mpi3mr.ko is built regardless of the Kconfig item value y or
      m. Also, if 'make localmodconfig' can not find the Kconfig item in the
      Makefile, then it does not generate CONFIG_SCSI_MPI3MR=m even when
      mpi3mr.ko is loaded on the system.
      
      Refer to the Kconfig item to avoid the issues.
      
      Fixes: c4f7ac64 ("scsi: mpi3mr: Add mpi30 Rev-R headers and Kconfig")
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Link: https://lore.kernel.org/r/20221207023659.2411785-1-shinichiro.kawasaki@wdc.com
      Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
      Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • regulator: da9211: Use irq handler when ready · 470f6a91
      Ricardo Ribalda authored
      [ Upstream commit 02228f6a ]
      
      If the system does not come from reset (for example, when it is booted
      via kexec()), the regulator might have an IRQ waiting for us.
      
      If we enable the IRQ handler before its structures are ready, we crash.
      
      This patch fixes:
      
      [    1.141839] Unable to handle kernel read from unreadable memory at virtual address 0000000000000078
      [    1.316096] Call trace:
      [    1.316101]  blocking_notifier_call_chain+0x20/0xa8
      [    1.322757] cpu cpu0: dummy supplies not allowed for exclusive requests
      [    1.327823]  regulator_notifier_call_chain+0x1c/0x2c
      [    1.327825]  da9211_irq_handler+0x68/0xf8
      [    1.327829]  irq_thread+0x11c/0x234
      [    1.327833]  kthread+0x13c/0x154
      
      Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
      Reviewed-by: Adam Ward <DLG-Adam.Ward.opensource@dm.renesas.com>
      Link: https://lore.kernel.org/r/20221124-da9211-v2-0-1779e3c5d491@chromium.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • x86/resctrl: Fix task CLOSID/RMID update race · 24107ad4
      Peter Newman authored
      commit fe1f0714 upstream.
      
      When the user moves a running task to a new rdtgroup using the task's
      file interface or by deleting its rdtgroup, the resulting change in
      CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the
      task(s) CPUs.
      
      x86 allows reordering loads with prior stores, so if the task starts
      running between a task_curr() check that the CPU hoisted before the
      stores in the CLOSID/RMID update then it can start running with the old
      CLOSID/RMID until it is switched again because __rdtgroup_move_task()
      failed to determine that it needs to be interrupted to obtain the new
      CLOSID/RMID.
      
      Refer to the diagram below:
      
      CPU 0                                   CPU 1
      -----                                   -----
      __rdtgroup_move_task():
        curr <- t1->cpu->rq->curr
                                              __schedule():
                                                rq->curr <- t1
                                              resctrl_sched_in():
                                                t1->{closid,rmid} -> {1,1}
        t1->{closid,rmid} <- {2,2}
        if (curr == t1) // false
         IPI(t1->cpu)
      
      A similar race impacts rdt_move_group_tasks(), which updates tasks in a
      deleted rdtgroup.
      
      In both cases, use smp_mb() to order the task_struct::{closid,rmid}
      stores before the loads in task_curr().  In particular, in the
      rdt_move_group_tasks() case, simply execute an smp_mb() on every
      iteration with a matching task.
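
The ordering described above can be modelled in user space with C11 atomics. This is only a sketch, not the kernel code: `model_task`, `model_rq` and `model_move_task()` are hypothetical names, and `atomic_thread_fence(memory_order_seq_cst)` is assumed here to play the role that smp_mb() plays in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the closid/rmid fields of task_struct. */
struct model_task {
    _Atomic unsigned int closid;
    _Atomic unsigned int rmid;
};

/* Hypothetical stand-in for a runqueue's "currently running" pointer. */
struct model_rq {
    _Atomic(struct model_task *) curr;
};

/*
 * Publish the new IDs, then decide whether the task's CPU must be
 * interrupted. The full fence keeps the two stores ordered before the
 * load of rq->curr; without it the load may be satisfied first, which
 * is the window shown in the diagram.
 */
static bool model_move_task(struct model_task *t, struct model_rq *rq,
                            unsigned int closid, unsigned int rmid)
{
    assert(t != NULL && rq != NULL);

    atomic_store_explicit(&t->closid, closid, memory_order_relaxed);
    atomic_store_explicit(&t->rmid, rmid, memory_order_relaxed);

    /* Assumed analogue of smp_mb(): order the stores before the load. */
    atomic_thread_fence(memory_order_seq_cst);

    /* Model of the task_curr() check: true means "send the IPI". */
    return atomic_load_explicit(&rq->curr, memory_order_relaxed) == t;
}
```

For the race to be closed, the scheduling side needs matching ordering between its store to the current-task pointer and its later read of the IDs; the sketch covers only the updater's half.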
      
      It is possible to use a single smp_mb() in rdt_move_group_tasks(), but
      this would require two passes and a means of remembering which
      task_structs were updated in the first loop. However, benchmarking
      results below showed too little performance impact in the simple
      approach to justify implementing the two-pass approach.
      
      Times below were collected using `perf stat` to measure the time to
      remove a group containing a 1600-task, parallel workload.
      
      CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads)
      
        # mkdir /sys/fs/resctrl/test
        # echo $$ > /sys/fs/resctrl/test/tasks
        # perf bench sched messaging -g 40 -l 100000
      
      task-clock time ranges collected using:
      
        # perf stat rmdir /sys/fs/resctrl/test
      
      Baseline:                     1.54 - 1.60 ms
      smp_mb() every matching task: 1.57 - 1.67 ms
      
        [ bp: Massage commit message. ]
      
      Fixes: ae28d1aa ("x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR")
      Fixes: 0efc89be ("x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount")
      Signed-off-by: Peter Newman <peternewman@google.com>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
      Reviewed-by: Babu Moger <babu.moger@amd.com>
      Cc: <stable@kernel.org>
      Link: https://lore.kernel.org/r/20221220161123.432120-1-peternewman@google.com

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      24107ad4
    • Eliav Farber's avatar
      EDAC/device: Fix period calculation in edac_device_reset_delay_period() · cd3da505
      Eliav Farber authored
      commit e8407743 upstream.
      
      Fix the period calculation for the case where the user sets a value of
      1000.  The input of round_jiffies_relative() must be in jiffies, not in
      milliseconds.
      
        [ bp: Use the same code pattern as in edac_device_workq_setup() for
          clarity. ]
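
The unit mix-up can be illustrated with a small user-space model. Everything here is an assumption made for illustration: `MODEL_HZ` is an arbitrary tick rate, `model_msecs_to_jiffies()` and `model_round_jiffies_relative()` are simplified stand-ins for the kernel helpers (the real rounding also depends on the current jiffies value and the CPU), and `period_buggy()` assumes the pre-fix code passed the millisecond value straight to the rounding helper.

```c
#include <assert.h>
#include <limits.h>

/* Assumed tick rate for this model; the real HZ is a kernel build option. */
#define MODEL_HZ 250UL

/* Simplified model of msecs_to_jiffies(): convert ms to ticks, rounding up. */
static unsigned long model_msecs_to_jiffies(unsigned long msec)
{
    assert(msec <= (ULONG_MAX - 999UL) / MODEL_HZ); /* avoid overflow */
    return (msec * MODEL_HZ + 999UL) / 1000UL;
}

/*
 * Simplified stand-in for round_jiffies_relative(): snap a relative delay,
 * expressed in jiffies, to a whole-second boundary.
 */
static unsigned long model_round_jiffies_relative(unsigned long j)
{
    unsigned long rem = j % MODEL_HZ;
    unsigned long rounded = (rem < MODEL_HZ / 4) ? j - rem
                                                 : j - rem + MODEL_HZ;

    return rounded ? rounded : j; /* never round a delay down to zero */
}

/* Assumed shape of the bug: milliseconds are handed to a jiffies API. */
static unsigned long period_buggy(unsigned long msec)
{
    unsigned long jiffs = model_msecs_to_jiffies(msec);

    if (msec == 1000)
        jiffs = model_round_jiffies_relative(msec); /* wrong unit */
    return jiffs;
}

/* Corrected shape: convert first, then round the jiffies value. */
static unsigned long period_fixed(unsigned long msec)
{
    unsigned long jiffs = model_msecs_to_jiffies(msec);

    if (msec == 1000)
        jiffs = model_round_jiffies_relative(jiffs);
    return jiffs;
}
```

With the assumed 250 Hz tick, `period_fixed(1000)` yields 250 jiffies (one second), while `period_buggy(1000)` yields 1000 jiffies (four seconds).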
      
      Fixes: c4cf3b45 ("EDAC: Rework workqueue handling")
      Signed-off-by: Eliav Farber <farbere@amazon.com>
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: <stable@kernel.org>
      Link: https://lore.kernel.org/r/20221020124458.22153-1-farbere@amazon.com

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd3da505
    • Peter Zijlstra's avatar
      x86/boot: Avoid using Intel mnemonics in AT&T syntax asm · ab0d02c5
      Peter Zijlstra authored
      commit 7c6dd961 upstream.
      
      With 'GNU assembler (GNU Binutils for Debian) 2.39.90.20221231' the
      build now reports:
      
        arch/x86/realmode/rm/../../boot/bioscall.S: Assembler messages:
        arch/x86/realmode/rm/../../boot/bioscall.S:35: Warning: found `movsd'; assuming `movsl' was meant
        arch/x86/realmode/rm/../../boot/bioscall.S:70: Warning: found `movsd'; assuming `movsl' was meant
      
        arch/x86/boot/bioscall.S: Assembler messages:
        arch/x86/boot/bioscall.S:35: Warning: found `movsd'; assuming `movsl' was meant
        arch/x86/boot/bioscall.S:70: Warning: found `movsd'; assuming `movsl' was meant
      
      Which is due to:
      
        PR gas/29525
      
        Note that with the dropped CMPSD and MOVSD Intel Syntax string insn
        templates taking operands, mixed IsString/non-IsString template groups
        (with memory operands) cannot occur anymore. With that
        maybe_adjust_templates() becomes unnecessary (and is hence being
        removed).
      
      More details: https://sourceware.org/bugzilla/show_bug.cgi?id=29525
      
      Borislav Petkov further explains:
      
        " the particular problem here is that the 'd' suffix is
          "conflicting" in the sense that you can have SSE mnemonics like movsD %xmm...
          and the same thing also for string ops (which is the case here) so apparently
          the agreement in binutils land is to use the always accepted suffixes 'l' or 'q'
          and phase out 'd' slowly... "
      
      Fixes: 7a734e7d ("x86, setup: "glove box" BIOS calls -- infrastructure")
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
      Link: https://lore.kernel.org/r/Y71I3Ex2pvIxMpsP@hirez.programming.kicks-ass.net

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab0d02c5
    • Kajol Jain's avatar
      powerpc/imc-pmu: Fix use of mutex in IRQs disabled section · a90d339f
      Kajol Jain authored
      commit 76d588dd upstream.
      
      Current imc-pmu code triggers a WARNING with CONFIG_DEBUG_ATOMIC_SLEEP
      and CONFIG_PROVE_LOCKING enabled, while running a thread_imc event.
      
      Command to trigger the warning:
        # perf stat -e thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ sleep 5
      
         Performance counter stats for 'sleep 5':
      
                         0      thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/
      
               5.002117947 seconds time elapsed
      
               0.000131000 seconds user
               0.001063000 seconds sys
      
      Below is a snippet of the warning in dmesg:
      
        BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
        in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2869, name: perf-exec
        preempt_count: 2, expected: 0
        4 locks held by perf-exec/2869:
         #0: c00000004325c540 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x64/0xa90
         #1: c00000004325c5d8 (&sig->exec_update_lock){++++}-{3:3}, at: begin_new_exec+0x460/0xef0
         #2: c0000003fa99d4e0 (&cpuctx_lock){-...}-{2:2}, at: perf_event_exec+0x290/0x510
         #3: c000000017ab8418 (&ctx->lock){....}-{2:2}, at: perf_event_exec+0x29c/0x510
        irq event stamp: 4806
        hardirqs last  enabled at (4805): [<c000000000f65b94>] _raw_spin_unlock_irqrestore+0x94/0xd0
        hardirqs last disabled at (4806): [<c0000000003fae44>] perf_event_exec+0x394/0x510
        softirqs last  enabled at (0): [<c00000000013c404>] copy_process+0xc34/0x1ff0
        softirqs last disabled at (0): [<0000000000000000>] 0x0
        CPU: 36 PID: 2869 Comm: perf-exec Not tainted 6.2.0-rc2-00011-g1247637727f2 #61
        Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
        Call Trace:
          dump_stack_lvl+0x98/0xe0 (unreliable)
          __might_resched+0x2f8/0x310
          __mutex_lock+0x6c/0x13f0
          thread_imc_event_add+0xf4/0x1b0
          event_sched_in+0xe0/0x210
          merge_sched_in+0x1f0/0x600
          visit_groups_merge.isra.92.constprop.166+0x2bc/0x6c0
          ctx_flexible_sched_in+0xcc/0x140
          ctx_sched_in+0x20c/0x2a0
          ctx_resched+0x104/0x1c0
          perf_event_exec+0x340/0x510
          begin_new_exec+0x730/0xef0
          load_elf_binary+0x3f8/0x1e10
        ...
        do not call blocking ops when !TASK_RUNNING; state=2001 set at [<00000000fd63e7cf>] do_nanosleep+0x60/0x1a0
        WARNING: CPU: 36 PID: 2869 at kernel/sched/core.c:9912 __might_sleep+0x9c/0xb0
        CPU: 36 PID: 2869 Comm: sleep Tainted: G        W          6.2.0-rc2-00011-g1247637727f2 #61
        Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
        NIP:  c000000000194a1c LR: c000000000194a18 CTR: c000000000a78670
        REGS: c00000004d2134e0 TRAP: 0700   Tainted: G        W           (6.2.0-rc2-00011-g1247637727f2)
        MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 48002824  XER: 00000000
        CFAR: c00000000013fb64 IRQMASK: 1
      
      The above warning is triggered because the current imc-pmu code takes a
      mutex in sections where interrupts are disabled. mutex_lock()
      internally calls __might_resched(), which checks whether IRQs are
      disabled and, if they are, triggers the warning.

      Fix the issue by changing the mutex to a spinlock.
      
      Fixes: 8f95faaa ("powerpc/powernv: Detect and create IMC device")
      Reported-by: Michael Petlan <mpetlan@redhat.com>
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      [mpe: Fix comments, trim oops in change log, add reported-by tags]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20230106065157.182648-1-kjain@linux.ibm.com

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a90d339f
    • Gavrilov Ilia's avatar
      netfilter: ipset: Fix overflow before widen in the bitmap_ip_create() function. · 511cf17b
      Gavrilov Ilia authored
      commit 9ea4b476 upstream.
      
      When first_ip is 0, last_ip is 0xFFFFFFFF, and netmask is 31, the
      arithmetic expression 2 << (netmask - mask_bits - 1) is subject to
      overflow, because its operands are not cast to a larger data type
      before the arithmetic is performed.
      
      Note that it's harmless since the value will be checked at the next step.
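
The "overflow before widen" pattern can be sketched in portable C as follows. This is an illustration under stated assumptions, not the code from bitmap_ip_create(): `elements_narrow()` and `elements_wide()` are hypothetical helpers, and the narrow variant uses an unsigned 32-bit operand so that the truncation is well defined. A plain `2` is a signed int, and shifting it until the result no longer fits in an int is undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Narrow variant: the shift is evaluated in 32 bits and only the already
 * truncated result is widened to 64 bits.
 */
static uint64_t elements_narrow(unsigned int netmask, unsigned int mask_bits)
{
    assert(netmask > mask_bits && netmask <= 32);

    uint32_t n = (uint32_t)2 << (netmask - mask_bits - 1);
    return n;
}

/*
 * Wide variant: the left operand is widened first, so the shift itself is
 * evaluated in 64 bits and cannot overflow for these inputs.
 */
static uint64_t elements_wide(unsigned int netmask, unsigned int mask_bits)
{
    assert(netmask > mask_bits && netmask <= 32);

    return (uint64_t)2 << (netmask - mask_bits - 1);
}
```

For small shifts both helpers agree. They diverge once the result needs more than 32 bits: `elements_wide(32, 0)` is 4294967296, whereas `elements_narrow(32, 0)` wraps to 0. For the case named in the commit message, `elements_wide(31, 0)` is 2147483648, a value a signed 32-bit int cannot hold.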
      
      Found by InfoTeCS on behalf of Linux Verification Center
      (linuxtesting.org) with SVACE.
      
      Fixes: b9fed748 ("netfilter: ipset: Check and reject crazy /0 input parameters")
      Signed-off-by: Ilia.Gavrilov <Ilia.Gavrilov@infotecs.ru>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      511cf17b